This is mailfromd.info, produced by makeinfo version 6.7 from
mailfromd.texi.

Published by the Free Software Foundation, 51 Franklin Street, Fifth
Floor, Boston, MA 02110-1301 USA

   Copyright (C) 2005-2021 Sergey Poznyakoff

   Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A
copy of the license is included in the section entitled "GNU Free
Documentation License".
INFO-DIR-SECTION Email
START-INFO-DIR-ENTRY
* Mailfromd: (mailfromd).             General-purpose mail-filtering software.
* mailfromd: (mailfromd) Invocation.  Mail Filtering and Real-time Modification daemon.
* calloutd: (mailfromd) calloutd.     A Stand-Alone Callout Daemon.
* mfdbtool: (mailfromd) mfdbtool.     Database Management Tool.
* mtasim: (mailfromd) mtasim.         MTA simulator.
* pmult: (mailfromd) pmult.           Pmilter multiplexer program.
END-INFO-DIR-ENTRY

   I dedicate this work to Lluis Llach, for opening new horizons.


File: mailfromd.info, Node: Top, Next: Preface, Up: (dir)

Mailfromd
*********

This edition of the 'Mailfromd Manual', last updated 15 February 2021,
documents 'mailfromd' Version 8.10.

* Menu:

* Preface::                 Short description of this manual; brief
                            history and acknowledgments.
* Intro::                   Introduction to Mailfromd.
* Building::                Building the Package.
* Tutorial::                Mailfromd Tutorial.
* MFL::                     The Mail Filtering Language.
* Library::                 The MFL Library Functions.
* Using MFL Mode::          Using the GNU Emacs MFL Mode.
* Mailfromd Configuration:: Configuring 'mailfromd'.
* Invocation::              How to Start and Stop 'mailfromd'.
* MTA Configuration::       Using 'mailfromd' with Various MTAs
* calloutd::                A Stand-Alone Callout Daemon.
* mfdbtool::                A Database Management Tool.
* mtasim::                  An MTA simulator.
* pmult::                   Pmilter multiplexer program.
* Reporting Bugs::          How to Report a Bug.

Appendices

* Gacopyz::
* Time and Date Formats::
* Upgrading::

* Copying This Manual::     The GNU Free Documentation License.
* Concept Index::           Index of Concepts.

 -- The Detailed Node Listing --

Preface

* History::                 Short 'mailfromd' history.
* Acknowledgments::         Acknowledgments.

Introduction to 'mailfromd'

* Conventions::             Typographical conventions.
* Overview::                Mailfromd at a first glance
* SAV::                     Principles of Sender Address Verification.
* Rate Limit::              Controlling Mail Sending Rate.
* SPF::                     SPF, DKIM, and others.

Sender Address Verification.

* Limitations::

Tutorial

* Start Up::
* Simplest Configurations::
* Conditional Execution::
* Functions and Modules::
* Domain Name System::
* Checking Sender Address::
* SMTP Timeouts::
* Avoiding Verification Loops::
* HELO Domain::
* rset::
* Controlling Number of Recipients::
* Sending Rate::
* Greylisting::
* Local Account Verification::
* Databases::
* Testing Filter Scripts::
* Run Mode::
* Logging and Debugging::
* Runtime errors::
* Notes::

Databases

* Database Formats::
* Basic Database Operations::
* Database Maintenance::

Run Mode

* top-block::               The Top of a Script File.
* getopt::                  Parsing Command Line Arguments.

Mail Filtering Language

* Comments::                Comments.
* Pragmas::                 Pragmatic comments.
* Data Types::
* Numbers::
* Literals::
* Here Documents::
* Sendmail Macros::
* Constants::
* Variables::
* Back references::
* Handlers::
* begin/end::
* Functions::               Functions.
* Expressions::             Expressions.
* Shadowing::               Variable and Constant Shadowing.
* Statements::
* Conditionals::            Conditional Statements.
* Loops::                   Loop Statements.
* Exceptions::              Exceptional Conditions and their Handling.
* Polling::                 Sender Verification Tests.
* Modules::                 Modules are Collections of Useful Functions.
* Preprocessor::            Input Text Is Preprocessed.
* Filter Script Example::   A Working Filter Script Explained.
* Reserved Words::          A Reference List of Reserved Words.

Pragmatic comments

* prereq::                  Pragma prereq.
* stacksize::               Pragma stacksize.
* regex::                   Pragma regex.
* dbprop::                  Pragma dbprop.
* greylist::                Pragma greylist.
* miltermacros::            Pragma miltermacros.
* provide-callout::         Pragma provide-callout.

Constants

* Built-in constants::

Variables

* Predefined variables::

Functions

* Some Useful Functions::

Expressions

* Constant expressions::    String and Numeric Constants.
* Function calls::          A Function Call is an Expression.
* Concatenation::           String Concatenation.
* Arithmetic operations::   '+', '-', etc.
* Bitwise shifts::          '<<' and '>>'.
* Relational expressions::  '=', '<', etc.
* Special comparisons::     'matches', 'mx matches', etc.
* Boolean expressions::     'and', 'or', 'not'.
* Precedence::              How various operators nest.
* Type casting::

Statements

* Actions::                 Actions control the handling of the mail.
* Assignments::
* Pass::
* Echo::

Exceptional Conditions

* Built-in Exceptions::
* User-defined Exceptions::
* Catch and Throw::

Modules

* module structure::        Declaring Modules
* scope of visibility::
* import::                  Require and Import

The MFL Library Functions

* Macro access::
* String transformation::
* String manipulation::
* String formatting::
* Character Type::
* Email processing functions::
* Envelope modification functions::
* Header modification functions::
* Body Modification Functions::
* Message modification queue::
* Mail header functions::
* Mail body functions::
* EOM Functions::
* Current Message Functions::
* Mailbox functions::
* Message functions::
* Quarantine functions::
* SMTP Callout functions::
* Compatibility Callout functions::
* Internet address manipulation functions::
* DNS functions::
* Geolocation functions::
* Database functions::
* I/O functions::
* System functions::
* Passwd functions::
* Sieve Interface::
* Interfaces to Third-Party Programs::
* Rate limiting functions::
* Greylisting functions::
* Special test functions::
* Mail Sending Functions::
* Blacklisting Functions::
* SPF Functions::
* DKIM::
* Sockmaps::
* NLS Functions::
* Syslog Interface::
* Debugging Functions::

Message Functions

* Header functions::
* Message body functions::
* MIME functions::
* Message digest functions::

Interfaces to Third-Party Programs

* SpamAssassin::
* DSPAM::
* ClamAV::

DSPAM

* flags-dspam::             DSPAM Operation Modes and Flags.
* class-dspam::             DSPAM Class and Source Bits.
* vars-dspam::              DSPAM Global Variables.

Note on interaction of dkim_sign with MMQ

* Setting up a DKIM record::

Configuring 'mailfromd'

* conf-types::              Special Configuration Data Types
* conf-base::               Base Mailfromd Configuration
* conf-server::             Server Configuration
* conf-milter::             Milter Connection Configuration
* conf-debug::              Logging and Debugging configuration
* conf-timeout::            Timeout Configuration
* conf-callout::            Call-out Configuration
* conf-priv::               Privilege Configuration
* conf-database::           Database Configuration
* conf-runtime::            Runtime Constants
* conf-mailutils::          Standard Mailutils Statements

'Mailfromd' Command Line Syntax

* options::                 Command Line Options.
* Starting and Stopping::   How to Start and Shut Down the Daemon.

Command Line Options.

* Operation Modifiers::
* General Settings::
* Preprocessor Options::
* Timeout Control::
* Logging and Debugging Options::
* Informational Options::

Using 'mailfromd' with Various MTAs

* Sendmail::
* MeTA1::
* Postfix::

'calloutd'

* config-calloutd::         Calloutd Configuration.
* invocation-calloutd::     Calloutd Command-Line Options.
* protocol-calloutd::       The Callout Protocol.

Calloutd Configuration

* conf-calloutd-setup::     'calloutd' General Setup.
* conf-calloutd-server::    The 'server' Statement.
* conf-calloutd-log::       'calloutd' Logging.

'mfdbtool'

* Invoking mfdbtool::
* Configuring mfdbtool::

'mtasim' -- a testing tool

* interactive mode::
* expect commands::
* traces::
* daemon mode::
* command summary::
* option summary::

Pmilter multiplexer program.

* pmult configuration::
* pmult example::
* pmult invocation::

Pmult Configuration

* pmult-conf::              Multiplexer Configuration.
* pmult-macros::            Translating MeTA1 macros.
* pmult-client::            Pmult Client Configuration.
* pmult-debug::             Debugging Pmult.

Upgrading

* 870-880::                 Upgrading from 8.7 to 8.8
* 850-860::                 Upgrading from 8.5 to 8.6
* 820-830::                 Upgrading from 8.2 to 8.3 (or 8.4)
* 700-800::                 Upgrading from 7.0 to 8.0
* 600-700::                 Upgrading from 6.0 to 7.0
* 5x0-600::                 Upgrading from 5.x to 6.0
* 500-510::                 Upgrading from 5.0 to 5.1
* 440-500::                 Upgrading from 4.4 to 5.0
* 43x-440::                 Upgrading from 4.3.x to 4.4
* 420-43x::                 Upgrading from 4.2 to 4.3.x
* 410-420::                 Upgrading from 4.1 to 4.2
* 400-410::                 Upgrading from 4.0 to 4.1
* 31x-400::                 Upgrading from 3.1.x to 4.0
* 30x-31x::                 Upgrading from 3.0.x to 3.1
* 2x-30x::                  Upgrading from 2.x to 3.0.x
* 1x-2x::                   Upgrading from 1.x to 2.x


File: mailfromd.info, Node: Preface, Next: Intro, Prev: Top, Up: Top

Preface
*******

The Simple Mail Transfer Protocol (SMTP), which is the standard for
email transmission across the Internet, was designed in the good old
days when nobody could even think of the possibility of e-mail being
abused to send tons of unsolicited messages of dubious content.
Therefore it lacks mechanisms that could have prevented this abuse
("spamming"), or at least could have made it difficult. Attempts to
introduce such mechanisms (such as the SMTP-AUTH extension
(http://tools.ietf.org/html/rfc2554)) are being made, but they are not
in wide use yet and, probably, their introduction will not be enough to
stop e-mail abuse. Spamming is today's grim reality, and developers
spend lots of time and effort designing new protection measures against
it. 'Mailfromd' is one such attempt.

   The package is designed to work with any MTA supporting the 'Milter'
or 'Pmilter' protocol, such as 'Sendmail', 'MeTA1' or 'Postfix'. It
allows you to:

   * Control whether messages come from trustworthy senders, using the
     so-called "callout" or "Sender Address Verification" (*note SAV::)
     mechanism.

   * Block emails coming from forged addresses by using the SPF
     mechanism (*note SPF Functions::).

   * Limit connection and/or sending rates (*note Rate Limit::).

   * Use "black-", "white-" and "greylisting" techniques.

   * Invoke external programs or other mail filters.

* Menu:

* History::                 Short 'mailfromd' history.
* Acknowledgments::         Acknowledgments.


File: mailfromd.info, Node: History, Next: Acknowledgments, Up: Preface

Short history of 'mailfromd'.
=============================

The idea of the utility appeared in 2005, and its first version appeared
soon afterward. Back then it was a simple implementation of Sender
Address Verification (*note SAV::) for 'Sendmail' (hence its name -
'mailfromd') with rudimentary tuning possibilities.

   After a short run on my mail servers, I discovered that the utility
was not flexible enough. It took less than a month to implement a
configuration file that allowed the user to control program and data
flow during the 'envfrom' SMTP state. The new version, 1.0, appeared in
June, 2005.

   The next major release, 1.2 (1.1 contained mostly bugfixes), appeared
two months later and introduced "mail sending rate" control (*note Rate
Limit::).

   The program evolved during the next year, and version 2.0 was
released in September, 2006. This version was a major change in the
main idea of the program. The configuration file became a flexible
filter script allowing the operator to control almost all SMTP states.
The program supplied in the script file was compiled into pseudo-code at
startup, and this code was subsequently evaluated each time the filter
was invoked. This caused a considerable speed-up in comparison with the
previous versions, where the run-time evaluator was traversing the parse
tree.
This version also introduced (implicitly, at the time) two
separate data types for the entities declared in the script, which also
played a role in the speed improvement (in the previous versions all
data were considered strings). Lots of improvements were made in the
filter language (MFL, *note MFL::) itself, such as user-defined
functions, the 'switch' statement, the 'catch' statement for handling
run-time errors, etc. The set of built-in functions was extended
considerably. A testsuite (using DejaGNU) was introduced in this
version.

   During this initial development period the limitations imposed by
the 'libmilter' implementation became obvious. Finally, I felt they
were stopping further development, and decided that 'mailfromd' should
use its own 'Milter' implementation. This new library, 'libgacopyz',
was the main new feature of the 3.0 release, which was released in
November, 2006. Other major features were the '--dump-macros' option
and the 'macros' argument to the 'rc.mailfromd' script, which were
intended to facilitate configuration on the 'Sendmail' side.

   The development of the 3.x (more properly, 3.1.x) series concentrated
mainly on bug fixes, while the main development was done on the next
branch.

   Version 4.0 appeared on May 12, 2007. A full list of changes in
this release is more than 500 lines long, so it is impractical to list
them here. In particular, this version introduced lots of new features
in MFL syntax and the library of useful MFL functions. The runtime
engine was also improved; in particular, stack space became expandable,
which eliminated many run-time errors. This version also provided a
foundation for the MFL module system. The code generation was
re-implemented to facilitate the introduction of object files in future
versions.
Other new features in this release include SPF support and the
'mtasim' utility -- an MTA simulator designed for testing 'mailfromd'
scripts (*note mtasim::). The test suite in this version was made
portable by rewriting it in Autotest.

   Another big leap forward was the 5.0 release, which appeared on
December 26, 2008. It greatly enriched the set of available functions
(61 new functions were introduced, which amounts to 41% of all the
functions available in the 5.0 release) and introduced several
improvements in MFL itself. Among others, function aliases and optional
arguments in user-defined functions were introduced in this release.
The new "run operation mode" made it possible to execute arbitrary MFL
functions from the command line. This release also raised the Mailutils
version requirement to at least 2.0.

   Version 6.0, released on 12 December, 2009, introduced a
full-fledged module system, akin to that of Python, and quite a few
improvements to the language, such as explicit type casts, the
concatenation operator, static variables, etc.

   Starting from version 7.0, the focus of further development of
'mailfromd' has shifted. While previously it had been regarded as a
mail-filtering server, since then it has been developed as a system for
extending MTA functionality in the broad sense, mail filtering being
only one of the features it provides.

   Version 7.0 makes the MFL syntax more consistent and the language
itself more powerful. For example, it is no longer necessary to use
prefixes before variables to dereference them. The new 'try--catch'
construct allows for elegant handling of exceptions and errors.
User-defined exceptions provide a way of programming complex loops and
recursions with non-local exits.

   This version introduces the concept of a dedicated callout server.
This allows 'mailfromd' to defer verifications for a later time if the
remote server does not respond within a reasonably short period of time
(*note SMTP Timeouts::).

   Six years later, version 8.0 was released. This version was a
major rewrite of the mailfromd codebase. It introduced a separate
callout daemon that made it possible to separate the mailfromd server
machine from machines performing callout checks. MFL was extended by a
number of built-in functions.

   Since version 8.3 (2017-11-02) 'mailfromd' has used 'adns'(1) for DNS
queries.

   Version 8.7, released in July, 2020, introduced DKIM support.

   ---------- Footnotes ----------

   (1) <https://www.gnu.org/software/adns>


File: mailfromd.info, Node: Acknowledgments, Prev: History, Up: Preface

Acknowledgments
===============

Many people need to be thanked for their assistance in developing and
debugging 'mailfromd'. After S. C. Johnson, I can say that this program
"owes much to a most stimulating collection of users, who have goaded me
beyond my inclination, and frequently beyond my ability in their endless
search for "one more feature". Their irritating unwillingness to learn
how to do things my way has usually led to my doing things their way;
most of the time, they have been right."

   A real test of a program like 'mailfromd' can only be done in a
production environment. The decision to try it in these conditions is
by no means an easy one; it requires courage and good faith in the
intentions and abilities of the author. To begin with, I would like to
thank my contributors for these virtues.

   Jan Rafaj has intrepidly been using 'mailfromd' since its early
releases and invested a lot of effort in improving the program and its
documentation. He is the author of many of the MFL library functions
shipped with the package.
Some of his ideas are still waiting in my
implementation queue, while new ones are consistently arriving.

   Peter Markeloff patiently tested every 'mailfromd' release and helped
discover and fix many bugs.

   Zeus Panchenko contributed many ideas and gave lots of helpful
comments. He offered invaluable help in debugging and testing
'mailfromd' on the FreeBSD platform.

   Sergey Afonin proposed many improvements and new ideas. He also
invested a lot of his time in finding bugs and testing bugfixes.

   John McEleney and Ben McKeegan contributed the token bucket filter
implementation (*note TBF::).

   Con Tassios helped to find and fix various bugs and contributed the
new implementation of the 'greylist' function (*note greylisting
types::).

   The following people (in alphabetical order) provided bug reports and
helpful comments for various versions of the program: Alan Dobkin, Brent
Spencer, Jeff Ballard, Nacho González López, Phil Miller, Simon
Christian, Thomas Lynch.


File: mailfromd.info, Node: Intro, Next: Building, Prev: Preface, Up: Top

1 Introduction to 'mailfromd'
*****************************

'Mailfromd' is a general-purpose mail filtering daemon and a suite of
accompanying utilities for 'Sendmail'(1), 'MeTA1'(2), 'Postfix'(3) or
any other MTA that supports the 'Milter' (or 'Pmilter') protocol. It is
able to filter both incoming and outgoing messages using a filter
program written in the "mail filtering language" (MFL). The daemon
interfaces with the MTA using the 'Milter' protocol.

   The name 'mailfromd' can be thought of as an abbreviation for '_Mail_
_F_iltering and _R_untime _M_odification' _D_aemon, with an 'o' for
itself. Historically, it stemmed from the fact that the original
implementation was a simple filter implementing the "sender address
verification" technique.
Since then the program has changed
dramatically, and now it is actually a language translator and run-time
evaluator providing a set of built-in and library functions for
filtering electronic mail.

   The first part of this manual is an overview, describing the features
'mailfromd' offers in general.

   The second part is a tutorial, which provides an introduction for
those who have not used 'mailfromd' previously. It moves from topic to
topic in a logical, progressive order, building on information already
explained. It offers only the principal information needed to master
basic practical usage of 'mailfromd', while omitting many subtleties.

   The other parts are meant to be used as a reference for those who
know 'mailfromd' well enough, but need to look up some notions from time
to time. Each chapter presents everything that needs to be said about a
specific topic.

   The manual assumes that the reader has a good knowledge of the SMTP
protocol and of the mail transport system in use ('Sendmail', 'Postfix'
or 'MeTA1').

* Menu:

* Conventions::             Typographical conventions.
* Overview::                Mailfromd at a first glance
* SAV::                     Principles of Sender Address Verification.
* Rate Limit::              Controlling Mail Sending Rate.
* SPF::                     SPF, DKIM, and others.

   ---------- Footnotes ----------

   (1) See <http://www.sendmail.org>

   (2) See <http://www.meta1.org>

   (3) See <http://www.postfix.org>


File: mailfromd.info, Node: Conventions, Next: Overview, Up: Intro

1.1 Typographical conventions
=============================

This manual is written using Texinfo, the GNU documentation formatting
language. The same set of Texinfo source files is used to produce both
the printed and online versions of the documentation. This section
briefly documents the typographical conventions used in this manual.

   Examples you would type at the command line are preceded by the
common shell primary prompt, '$'. The command itself is printed 'in
this font', and the output it produces 'in this font', for example:

     $ mailfromd --version
     mailfromd (mailfromd 8.10)

   In the text, command names are printed 'like this', and command line
options are displayed in 'this font'. Some notions are emphasized _like
this_, and if a point needs to be made strongly, it is done *this way*.
The first occurrence of a new term is usually its "definition" and
appears in the same font as the previous occurrence of "definition" in
this sentence. File names are indicated like this: '/path/to/ourfile'.

   Variable names are represented LIKE THIS; keywords and fragments
of program text are written in 'this font'.


File: mailfromd.info, Node: Overview, Next: SAV, Prev: Conventions, Up: Intro

1.2 Overview of Mailfromd
=========================

In contrast to most existing milter filters, 'mailfromd' does not
implement any default filtering policies. Instead, it depends entirely
on a "filter script" supplied to it by the administrator. The script,
written in a specialized and simple-to-use language called MFL (*note
MFL::), is supposed to run a set of tests and decide whether the
message should be accepted by the MTA or not. To perform the tests, the
script can examine the values of 'Sendmail' macros, use an extensive set
of built-in and library functions, and invoke user-defined functions.


File: mailfromd.info, Node: SAV, Next: Rate Limit, Prev: Overview, Up: Intro

1.3 Sender Address Verification.
================================

"Sender address verification", or "callout", is one of the basic mail
verification techniques implemented by 'mailfromd'.
It consists in
probing each MX server for the given address until one of them gives a
definite (positive or negative) reply. Using this technique you can
block a sender address if it is not deliverable, thereby cutting off a
large amount of spam. It can also be useful to block mail for
undeliverable recipients, for example on a mail relay host that does not
have a list of all the valid recipient addresses. This prevents
undeliverable junk mail from entering the queue, so that your MTA
doesn't have to waste resources trying to send 'MAILER-DAEMON' messages
back.

   Let's illustrate how it works with an example:

   Suppose that the user '<jsmith@somedomain.net>' is trying to send
mail to one of your local users. The remote machine connects to your
MTA and issues the 'MAIL FROM: <jsmith@somedomain.net>' command.
However, your MTA does not have to take its word for it, so it uses
'mailfromd' to verify the sender address validity. 'Mailfromd' extracts
the domain name from the address ('somedomain.net') and queries DNS for
the 'MX' records of that domain. Suppose it receives the following
list:

     10 relay1.somedomain.net
     20 relay2.somedomain.net

   It then connects to the first MX server, using the SMTP protocol, as
if it were going to send a message to '<jsmith@somedomain.net>'. This
is called sending a "probe message". If the server accepts the
recipient address, 'mailfromd' accepts the incoming mail. Otherwise, if
the server rejects the address, the mail is rejected as well. If the MX
server cannot be reached, 'mailfromd' selects the next server from the
list and continues this process until it finds the answer or the list of
servers is exhausted.

   The "probe message" is like a normal mail except that no data are
ever sent.
The probe message transaction in our example might
look as follows ('S:' marks messages sent by the remote MTA, 'C:' marks
those sent by 'mailfromd'):

     C: HELO mydomain.net
     S: 250 OK, nice to meet you
     C: MAIL FROM: <>
     S: 250 <>: Sender OK
     C: RCPT TO: <jsmith@somedomain.net>
     S: 250 <jsmith@somedomain.net>: Recipient OK
     C: QUIT

   Probe messages are never delivered, deferred or bounced; they are
always discarded.

   The described method of address verification is called the
"standard" method throughout this document. 'Mailfromd' also implements
a method we call "strict". When using the strict method, 'mailfromd'
first resolves the IP address of the sender machine to a fully qualified
domain name. Then it obtains the 'MX' records for this machine, and
proceeds with probing as described above.

   So, the difference between the two methods is in the set of 'MX'
records that are being probed: the standard method queries 'MX's based
on the sender email domain, whereas the strict method works with 'MX's
for the sender IP address.

   The strict method can cut off a much larger amount of spam, although
it has many drawbacks. Returning to our example above, consider
the following situation: '<jsmith@somedomain.net>' is a perfectly normal
address, but it is being used by a spammer from some other domain, say
'otherdomain.com'. The standard method is not able to cope with such
cases, whereas the strict one is.

   An alert reader will ask: what happens if 'mailfromd' is not able to
get a definite answer from any of the MX servers? Actually, it depends
entirely on how you instruct it to act in this case, but the
general practice is to return a temporary failure, which will urge the
remote party to retry sending their message later.
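
   The standard verification procedure described above can be sketched
in Python. This is only a minimal illustration under stated assumptions,
not the way 'mailfromd' is implemented: the names 'Probe', 'Verdict',
'verify_sender' and 'demo' are hypothetical, and the SMTP dialogue is
replaced by a caller-supplied function, so the code performs no network
access.

```python
from enum import Enum
from typing import Callable, Iterable, Tuple


class Probe(Enum):
    """Outcome of one probe (hypothetical names, for illustration only)."""
    ACCEPTED = 1     # definite positive reply to RCPT TO
    REJECTED = 2     # definite negative reply to RCPT TO
    UNREACHABLE = 3  # could not connect, or no definite reply


class Verdict(Enum):
    """What the filter would tell the MTA."""
    ACCEPT = "accept"
    REJECT = "reject"
    TEMPFAIL = "tempfail"


def verify_sender(
    mx_records: Iterable[Tuple[int, str]],
    probe: Callable[[str], Probe],
) -> Verdict:
    """Probe MX hosts in order of preference until one answers definitely."""
    # Lower preference values are tried first, as in the MX list above.
    for _preference, host in sorted(mx_records):
        result = probe(host)
        if result is Probe.ACCEPTED:
            return Verdict.ACCEPT
        if result is Probe.REJECTED:
            return Verdict.REJECT
        # Unreachable server: go on to the next MX in the list.
    # The list is exhausted without a definite reply. What happens now is
    # up to the filter script; this sketch follows the general practice
    # of returning a temporary failure.
    return Verdict.TEMPFAIL


def demo() -> Verdict:
    """The example from the text: relay1 is down, relay2 accepts."""
    mx = [(20, "relay2.somedomain.net"), (10, "relay1.somedomain.net")]
    replies = {
        "relay1.somedomain.net": Probe.UNREACHABLE,
        "relay2.somedomain.net": Probe.ACCEPTED,
    }
    return verify_sender(mx, replies.__getitem__)
```

   Passing the probe as a function keeps the decision logic separate
from the SMTP exchange, which makes the sketch easy to try with canned
replies such as those in 'demo'.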

   After receiving a definite answer, 'mailfromd' will cache it in its
database, so that the next time your MTA receives a message from that
address (or from the sender IP/email address pair, for the strict
method), it will not waste its time trying to reach MX servers again.
The records remain in the cache database for a certain time, after which
they are discarded.

* Menu:

* Limitations::


File: mailfromd.info, Node: Limitations, Up: SAV

1.3.1 Limitations of Sender Address Verification
------------------------------------------------

Before deciding whether and how to use sender address verification, you
should be aware of its limitations.

   Both standard and strict methods suffer from the following
limitations:

   * The sender verification methods will perform poorly on highly
     loaded sites. The traffic and/or resource usage overhead may not
     be acceptable for you. However, you may experiment with various
     'mailfromd' options to find an optimal configuration.

   * Some sites may blacklist your MTA if it probes them too often.
     'Mailfromd' mitigates this drawback by using a "cache database",
     which keeps the results of recent callouts.

   * When verifying the remote address, no attempt to actually deliver
     the message is made. If the MTA accepts the address, 'mailfromd'
     assumes it is OK. However, in reality, mail for a remote address
     can bounce _after_ the nearest MTA accepts the recipient address.

     This drawback can often be avoided by combining sender address
     verification with greylisting (*note Greylisting::).

   * If the remote server rejects the address, no attempt is made
     to discern between various reasons for rejection (client rejected,
     'HELO' rejected, 'MAIL FROM' rejected, etc.).

   * Some major sites such as 'yahoo.com' do not reject unknown
     addresses in reply to the 'RCPT TO' command, but report a delivery
     failure in response to the end of 'DATA' after a message is
     transferred. Of course, sender address verification does not work
     with such sites. However, a combination of address verification
     and greylisting (*note Greylisting::) may be a good choice in such
     cases.

   In addition, strict verification breaks mail forwarding. This is to
be expected, since mail forwarding is based on delivering an unmodified
message to another location, so the sender address domain will most
probably not be the same as that of the MTA doing the forwarding.


File: mailfromd.info, Node: Rate Limit, Next: SPF, Prev: SAV, Up: Intro

1.4 Controlling Mail Sending Rate.
==================================

"Mail Sending Rate" for a given identity is defined as the number of
messages with this identity received within a predefined interval of
time.

   MFL offers a set of functions for limiting mail sending rate (*note
Rate limiting functions::), and for controlling broader rate aspects,
such as data transfer rates (*note TBF::).


File: mailfromd.info, Node: SPF, Prev: Rate Limit, Up: Intro

1.5 SPF, DKIM, and others
=========================

"Sender Policy Framework", or SPF for short, is an extension to the
SMTP protocol that makes it possible to identify forged identities
supplied with the 'MAIL FROM' and 'HELO' commands. The framework is
explained in detail in RFC 4408 (<http://tools.ietf.org/html/rfc4408>)
and on the SPF Project Site (http://www.openspf.org/).

   Mailfromd provides a set of functions for using SPF to control mail
flow. These are described in *note SPF Functions::.

   "DomainKeys Identified Mail" (DKIM) is an email authentication method
designed to detect forged sender addresses in emails.
Mailfromd
supports both DKIM signing and verification. *Note DKIM::, for a
detailed description of these features.

   Mailfromd also provides support for several third-party
spam-abatement programs, in particular 'SpamAssassin', 'ClamAV', and
DSPAM. These are discussed in *note Interfaces to Third-Party
Programs::.


File: mailfromd.info, Node: Building, Next: Tutorial, Prev: Intro, Up: Top

2 Building the Package
**********************

This chapter contains a detailed list of steps you need to undertake in
order to configure and build the package.

  1. Make sure you have the necessary software installed.

     To build 'mailfromd' you will need to have the following packages
     on your machine:

       A. GNU mailutils version 3.3 or newer.

          GNU mailutils is a general-purpose library for handling
          electronic mail. It is available from <http://mailutils.org>.

       B. GNU adns library, version 1.5.1 or newer.

          GNU adns is an advanced DNS client library. The recent
          version can be downloaded from
          <http://www.chiark.greenend.org.uk/~ian/adns/adns.tar.gz>.
          Visit <http://www.gnu.org/software/adns> for more
          information.

       C. A DBM library. 'Mailfromd' is able to link with any flavor of
          DBM supported by GNU mailutils. As of version 8.10 it will
          refuse to build without DBM. By default, 'configure' will try
          to find the best implementation installed on your machine
          (preference is given to Berkeley DB) and will use it. You
          can, however, explicitly specify which implementation you want
          to use. To do so, use the '--with-dbm' configure option. Its
          argument specifies the "type" of database to use. It must be
          one of the types supported by GNU mailutils. At the time of
          this writing, these are:

          bdb
               Berkeley DB (versions 2 to 6).
          gdbm
               GNU DBM.
          kc
               Kyoto Cabinet
          tc
               Tokyo Cabinet
          ndbm
               NDBM

          To check what database types are supported by your version of
          mailutils, run the following command:

               $ mailutils dbd gdbm kc tc ndbm

          For backward compatibility, 'configure' accepts the following
          two options:

          '--with-gdbm'
               Same as '--with-dbm=gdbm'.
          '--with-berkeley-db'
               Same as '--with-dbm=bdb'.

          For 'Sendmail' users, it often makes sense to configure
          'mailfromd' to use the same database flavor as 'sendmail'.
          The following table will help you do that.  The column 'DB
          type' lists types of DBM databases supported by 'mailfromd'.
          The column 'confMAPDEF' lists the value of the 'confMAPDEF'
          Sendmail configuration macro corresponding to that database
          type.  The column 'configure option' contains the
          corresponding option to 'configure'.

          DB type        confMAPDEF     configure option
          ------------------------------------------------------
          NDBM           '-DNDBM'       '--with-dbm=ndbm'
          Berkeley DB    '-DNEWDB'      '--with-dbm=bdb'
          GDBM           N/A            '--with-dbm=gdbm'

  2. Decide what user privileges will be used to run 'mailfromd'.

     After startup, the program drops root privileges.  By default, it
     switches to the privileges of user 'mail', group 'mail'.  If there
     is no such user on your system, or you wish to use another user
     account for this purpose, override it using the DEFAULT_USER
     environment variable.  For example, to have 'mailfromd' run as
     user 'nobody', use

          ./configure DEFAULT_USER=nobody

     The user name can also be changed at run-time (*note --user::).

  3. Decide where to install 'mailfromd' and where its filter script and
     data files will be located.

     As usual, the default value for the installation prefix is
     '/usr/local'.  If it does not suit you, specify another location
     using the '--prefix' option, e.g.: '--prefix=/usr'.

     During the installation phase, the build system will install
     several files.  These files are:

     'PREFIX/sbin/mailfromd'
          Main daemon.  *Note mailfromd: Invocation.

     'PREFIX/etc/mailfromd.mf'
          Default main filter script file.  It is installed only if it
          is not already there.  Thus, if you are upgrading to a newer
          version of 'mailfromd', your old script file will be preserved
          with all your changes.

          *Note MFL::, for a description of the mail filtering language.

     'PREFIX/share/mailfromd/8.10/*.mf'
          MFL modules.  *Note Modules::.

     'PREFIX/info/mailfromd.info*'
          Documentation files.

     'PREFIX/bin/mtasim'
          MTA simulator program for testing 'mailfromd' scripts.  *Note
          mtasim::.

     'PREFIX/sbin/pmult'
          Pmilter multiplexer for 'MeTA1'.  *Note pmult::.  It is built
          only if 'MeTA1' version 'PreAlpha29.0' or newer is installed
          on the system.  You may disable it by using the
          '--disable-pmilter' command line option.

          When testing for 'MeTA1' presence, 'configure' assumes its
          default location.  If it is not found there, inform
          'configure' about its actual location by using the following
          option:

               --enable-pmilter=PREFIX

          where PREFIX stands for the 'MeTA1' installation prefix.

     It is advisable to use the same settings for file name prefixes as
     those you used when configuring 'mailutils'.  In particular, try to
     use the same '--sysconfdir', since it will facilitate configuring
     the whole system.

     Another important point is the location of the "local state
     directory", i.e. a directory where 'mailfromd' keeps its data files
     (e.g. communication socket, PID-file and database files).  By
     default, its full name is 'LOCALSTATEDIR/mailfromd'.  You can
     change it by setting the 'DEFAULT_STATE_DIR' configuration variable.
     This value can be changed at run-time using the 'state-directory'
     configuration statement (*note state-directory: conf-base.).

  4. Select default communication socket.  This is the socket used to
     communicate with the MTA, in the usual 'Milter' port notation
     (*note milter port specification::).  If the socket name does not
     begin with a protocol or directory separator, it is assumed to be a
     UNIX socket, located in the local state directory.  The default
     value is 'mailfrom', which is equivalent to
     'unix:LOCALSTATEDIR/mailfromd/mailfrom'.

     To alter this, use the 'DEFAULT_SOCKET' environment variable, e.g.:

          ./configure DEFAULT_SOCKET=inet:999@localhost

     The communication socket can be changed at run time using the
     '--port' command line option (*note --port::) or the 'listen'
     configuration statement (*note listen: conf-server.).

  5. Select default expiration interval.  "Expiration interval" defines
     the period of time during which a record in the 'mailfromd'
     database is considered valid.  It is described in more detail in
     *note Databases::.  The default value is 86400 seconds, i.e. 24
     hours.  It is OK for most sites.  If, however, you wish to change
     it, use the DEFAULT_EXPIRE_INTERVAL environment variable.

     The 'DEFAULT_EXPIRE_RATES_INTERVAL' variable sets the default
     expiration time for the mail rate database (*note Rate limiting
     functions::).

     Expiration settings can be changed at run time using the 'database'
     statement in the 'mailfromd' configuration file (*note
     conf-database::).

  6. Select a 'syslog' implementation to use.

     'Mailfromd' uses 'syslog' for diagnostics output.  The default
     'syslog' implementation on most systems (most notably, on
     GNU/Linux) uses blocking 'AF_UNIX SOCK_DGRAM' sockets.
     As a result, when an application calls 'syslog()', and 'syslogd'
     is not responding and the socket buffers get full, the application
     will hang.

     For 'mailfromd', as for any daemon, it is more important that it
     continue to run, than that it continue to log.  For this purpose,
     'mailfromd' is shipped with a non-blocking 'syslog' implementation
     by Simon Kelley.  This implementation, instead of blocking, buffers
     log lines in memory.  When the log buffer overflows, some lines are
     lost, but the daemon continues to run.  When lines are lost, this
     fact is logged with a message of the form:

          async_syslog overflow: 5 log entries lost

     To enable this implementation, configure the package with the
     '--enable-syslog-async' option, e.g.:

          ./configure --enable-syslog-async

     Additionally, you can instruct 'mailfromd' to use asynchronous
     syslog by default.  To do so, set 'DEFAULT_SYSLOG_ASYNC' to 1, as
     shown in the example below:

          ./configure --enable-syslog-async DEFAULT_SYSLOG_ASYNC=1

     You will be able to override these defaults at run-time by using
     the '--logger' command line option (*note Logging and Debugging::).

  7. Run 'configure' with all the desired options.

     For example, the following command:

          ./configure DEFAULT_SOCKET=inet:999@localhost --with-berkeley-db=3

     will configure the package to use Berkeley DB database, version 3,
     and 'inet:999@localhost' as the default communication socket.

     At the end of its run 'configure' will print a concise summary of
     its configuration settings.  It looks like this (with the long
     lines being split for readability):

          *******************************************************************
          Mailfromd configured with the following settings:

          External preprocessor..................... /usr/bin/m4 -s
          DBM version............................... Berkeley DB v. 3
          Default user.............................. mail
          State directory...........................
                                         $(localstatedir)/$(PACKAGE)
          Socket.................................... mailfrom
          Expiration interval....................... 86400
          Negative DNS answer expiration interval... 3600
          Rates expire interval..................... 300
          Default syslog implementation............. blocking
          Readline (for mtasim)..................... yes
          Documentation rendition type.............. PROOF
          Enable pmilter support.................... no
          Enable GeoIP support...................... no
          *******************************************************************

     Make sure these settings satisfy your needs.  If they do not,
     reconfigure the package with the right options.

  8. Run 'make'.

  9. Run 'make install'.

  10. Make sure 'LOCALSTATEDIR/mailfromd' has the right owner and mode.

  11. Examine the filter script file ('SYSCONFDIR/mailfromd.mf') and
     edit it, if necessary.

  12. If you are upgrading from an earlier release of Mailfromd, refer
     to *note Upgrading::, for detailed instructions.


File: mailfromd.info,  Node: Tutorial,  Next: MFL,  Prev: Building,  Up: Top

3 Tutorial
**********

This chapter contains a tutorial introduction, guiding you through
various 'mailfromd' configurations, starting from the simplest ones and
proceeding up to more advanced forms.  It omits the most complicated
details, concentrating mainly on the common practical tasks.

   If you are familiar with 'mailfromd', you can skip this chapter and
go directly to the next one (*note MFL::), which contains a detailed
discussion of the mail filtering language and 'mailfromd' interaction
with the Mail Transfer Agent.

* Menu:

* Start Up::
* Simplest Configurations::
* Conditional Execution::
* Functions and Modules::
* Domain Name System::
* Checking Sender Address::
* SMTP Timeouts::
* Avoiding Verification Loops::
* HELO Domain::
* rset::
* Controlling Number of Recipients::
* Sending Rate::
* Greylisting::
* Local Account Verification::
* Databases::
* Testing Filter Scripts::
* Run Mode::
* Logging and Debugging::
* Runtime errors::
* Notes::


File: mailfromd.info,  Node: Start Up,  Next: Simplest Configurations,  Up: Tutorial

3.1 Start Up
============

The 'mailfromd' utility runs as a standalone "daemon" program and
listens on a predefined communication channel for requests from the
"Mail Transfer Agent" (MTA, for short).  When processing each message,
the MTA establishes a connection with 'mailfromd', and goes through
several states, collecting the necessary data from the sender.  At each
state it sends the relevant information to 'mailfromd', and waits for it
to reply.  The 'mailfromd' filter receives the message data through
"Sendmail macros" and runs a "handler program" defined for the given
state.  The result of this run is a "response code", which it returns to
the MTA. The following response codes are defined:

'continue'
     Continue message processing.

'accept'
     Accept this message for delivery.  After receiving this code the
     MTA continues processing this message without further consulting
     the 'mailfromd' filter.

'reject'
     Reject this message.  The message processing stops at this stage,
     and the sender receives the reject reply ('5XX' reply code).  No
     further 'mailfromd' handlers are called for this message.

'discard'
     Silently discard the message.
     This means that the MTA will continue processing this message as
     if it were going to deliver it, but will discard it after
     receiving it.  No further interaction with 'mailfromd' occurs.

'tempfail'
     Temporarily reject the message.  The message processing stops at
     this stage, and the sender receives the 'temporary failure' reply
     ('4XX' reply code).  No further 'mailfromd' handlers are called for
     this message.

   The instructions on how to process the message are supplied to
'mailfromd' in its "filter script file".  It is normally called
'/usr/local/etc/mailfromd.mf' (but can be located elsewhere, *note
Invocation::) and contains a set of "milter state handlers", or
subroutines to be executed in various SMTP states.  Each interaction
state can be supplied its own handling procedure.  A missing procedure
implies the 'continue' response code.

   The filter script can define up to nine "milter state handlers",
called after the names of milter states: 'connect', 'helo', 'envfrom',
'envrcpt', 'data', 'header', 'eoh', 'body', and 'eom'.  The 'data'
handler is invoked only if the MTA uses Milter protocol version 3 or
later.  Two special handlers are available for initialization and
clean-up purposes: 'begin' is called before the processing starts, and
'end' is called after it is finished.  The diagram below shows the
control flow when processing an SMTP transaction.  Lines marked with
'C:' show SMTP commands issued by the remote machine (the "client"),
those marked with '=>' show called handlers with their arguments.
An '[R]' appearing at the start of a line indicates that this part of
the transaction can be repeated any number of times:

     => begin()
     => connect(HOSTNAME, FAMILY, PORT, 'IP address')
     C: HELO DOMAIN
     => helo(DOMAIN)
     for each message transaction
     do
         C: MAIL FROM SENDER
         => envfrom(SENDER)

     [R] C: RCPT TO RECIPIENT
         => envrcpt(RECIPIENT)

         C: DATA
         => data()
     [R] C: HEADER: VALUE
         => header(HEADER, VALUE)

         C:
         => eoh()

     [R] C: BODY-LINE
         => /* Collect lines into blocks BLK of
         =>  * at most LEN bytes and for each
         =>  * such block call:
         =>  */
         => body(BLK, LEN)

         C: .
         => eom()
     done
     => end()

Figure 3.1: Mailfromd Control Flow

   This control flow is maintained for as long as each called handler
returns 'continue' (*note Actions::).  Otherwise, if any handler returns
'accept' or 'discard', the message processing continues, but no other
handler is called.  In the case of 'accept', the MTA will accept the
message for delivery, in the case of 'discard' it will silently discard
it.

   If any of the handlers returns 'reject' or 'tempfail', the result
depends on the handler.  If this code is returned by the 'envrcpt'
handler, it causes this particular recipient address to be rejected.
When returned by any other handler, it causes the whole message to be
rejected.

   The 'reject' and 'tempfail' actions executed by the 'helo' handler do
not take effect immediately.  Instead, their action is deferred until
the next SMTP command from the client, which is usually 'MAIL FROM'.


File: mailfromd.info,  Node: Simplest Configurations,  Next: Conditional Execution,  Prev: Start Up,  Up: Tutorial

3.2 Simplest Configurations
===========================

The 'mailfromd' script file contains a series of "declarations" of the
handler procedures.
Each declaration has the form:

     prog NAME
     do
       ...
     done

where 'prog', 'do' and 'done' are the "keywords", and NAME is the state
name for this handler.  The dots in the above example represent the
actual "code", or a set of commands, instructing 'mailfromd' how to
process the message.

   For example, the declaration:

     prog envfrom
     do
       accept
     done

installs a handler for the 'envfrom' state, which always approves the
message for delivery, without any further interaction with 'mailfromd'.

   The word 'accept' in the above example is an "action".  An "action"
is a special language statement that instructs the run-time engine to
stop execution of the program and to return a response code to
'Sendmail'.  There are five actions, one for each response code:
'continue', 'accept', 'reject', 'discard', and 'tempfail'.  Among these,
'reject' and 'tempfail' can optionally take one to three arguments.
There are two ways of supplying the arguments.

   In the first form, called "literal" or "traditional" notation, the
arguments are supplied as additional words after the action name,
separated by whitespace.  The first argument is a three-digit RFC 2821
reply code.  It must begin with '5' for 'reject' and with '4' for
'tempfail'.  If two arguments are supplied, the second argument must be
either an "extended reply code" (RFC 1893/2034) or a textual string to
be returned along with the SMTP reply.  Finally, if all three arguments
are supplied, then the second one must be an extended reply code and the
third one must supply the textual string.  The following examples
illustrate all possible ways of using the 'reject' statement in literal
notation:

     reject
     reject 503
     reject 503 5.0.0
     reject 503 "Need HELO command"
     reject 503 5.0.0 "Need HELO command"

Please note the quotes around the textual string.
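
   The same literal notation applies to 'tempfail'.  The fragment below
is only a minimal sketch: the reply code '451', the extended code
'4.7.1' and the message text are assumptions chosen for illustration,
not values prescribed by this manual.

     prog envfrom
     do
       /* Illustrative values only; the first argument begins with '4',
          as required for 'tempfail'.  */
       tempfail 451 4.7.1 "Service temporarily unavailable"
     done

A handler written this way would answer every 'MAIL FROM' command with
a temporary failure, so it is of use only as a syntax illustration.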

   Another form for these actions is called "functional" notation,
because it resembles the function syntax.  When used in this form, the
action word is followed by a parenthesized group of exactly three
arguments, separated by commas.  The meaning and ordering of the
arguments are the same as in literal form.  Any of the three arguments
may be absent, in which case it will be replaced by the default value.
To illustrate this, here are the statements from the previous example,
written in functional notation:

     reject(,,)
     reject(503,,)
     reject(503, 5.0.0,)
     reject(503,, "Need HELO command")
     reject(503, 5.0.0, "Need HELO command")


File: mailfromd.info,  Node: Conditional Execution,  Next: Functions and Modules,  Prev: Simplest Configurations,  Up: Tutorial

3.3 Conditional Execution
=========================

Programs consisting of a single action are rarely useful.  In most cases
you will want to do some checking and decide whether to process the
message depending on its result.  For example, if you do not want to
accept messages from the address '<badguy@some.net>', you could write
the following program:

     prog envfrom
     do
       if $f = "badguy@some.net"
         reject
       else
         accept
       fi
     done

   This example illustrates several important concepts.  First of all,
'$f' in the third line is a "Sendmail macro reference".  Sendmail macros
are referenced the same way as in 'sendmail.cf', with the only
difference that curly braces around macro names are optional, even if
the name consists of several letters.  The value of a macro reference is
always a string.

   The equality operator ('=') compares its left and right arguments and
evaluates to true if the two strings are exactly the same, or to false
otherwise.  Apart from equality, you can use the regular relational
operators: '!=', '>', '>=', '<' and '<='.
Notice that string comparison in 'mailfromd' is always case sensitive.
To do case-insensitive comparison, translate both operands to upper or
lower case (*Note tolower::, and *note toupper::).

   The 'if' statement decides what actions to execute depending on the
value its condition evaluates to.  Its usual form is:

     if EXPRESSION THEN-BODY [else ELSE-BODY] fi

   The THEN-BODY is executed if the EXPRESSION evaluates to 'true' (i.e.
to any non-zero value).  The optional ELSE-BODY is executed if the
EXPRESSION yields 'false' (i.e. zero).  Both THEN-BODY and ELSE-BODY
can contain other 'if' statements; their nesting depth is not limited.
To facilitate writing complex conditional statements, the 'elif' keyword
can be used to introduce alternative conditions, for example:

     prog envfrom
     do
       if $f = "badguy@some.net"
         reject
       elif $f = "other@domain.com"
         tempfail 470 "Please try again later"
       else
         accept
       fi
     done

   *Note switch::, for more elaborate forms of conditional branching.


File: mailfromd.info,  Node: Functions and Modules,  Next: Domain Name System,  Prev: Conditional Execution,  Up: Tutorial

3.4 Functions and Modules
=========================

Like any programming language, MFL supports the concept of a
"function", i.e. a body of code that is assigned a unique name and can
be invoked elsewhere as many times as needed.

   All functions have a "definition" that introduces types and names of
the formal parameters and the result type, if the function is to return
a meaningful value (function definitions in MFL are discussed in detail
in *note User-Defined Functions: User-defined.).

   A function is invoked using a special construct, a "function call":

     NAME (ARG-LIST)

where NAME is the function name, and ARG-LIST is a comma-separated list
of expressions.
Each expression in ARG-LIST is evaluated, and its type is compared with
that of the corresponding formal argument.  If the types differ, the
expression is converted to the formal argument type.  Finally, a copy of
its value is passed to the function as the corresponding argument.  The
order in which the expressions are evaluated is not defined.  The
compiler checks that the number of elements in ARG-LIST matches the
number of mandatory arguments for function NAME.

   If the function does not deliver a result, it should only be called
as a statement.

   Functions may be recursive, even mutually recursive.

   'Mailfromd' comes with a rich set of predefined functions for various
purposes.  There are two basic function classes: "built-in" functions,
which are implemented by the MFL runtime environment in 'mailfromd', and
"library" functions, which are implemented in MFL. The built-in
functions are always available and no preparatory work is needed before
calling them.  In contrast, the library functions are defined in
"modules", special MFL source files that contain functions designed for
a particular task.  In order to access a library function, you must
first "require" the module it is defined in.  This is done using the
'require' statement.  For example, the function 'hostname' looks up in
the DNS the name corresponding to the IP address specified as its
argument.  This function is defined in module 'dns.mf', so before
calling it you must require this module:

     require dns

The 'require' statement takes a single argument: the name of the
requested module (without the '.mf' suffix).  It looks up the module on
disk and loads it if it is available.

   For more information about the module system, see *note Modules::.
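
   To make the notions of function definition and function call more
concrete, here is a minimal sketch.  The function name 'same_address'
is hypothetical and introduced here only for illustration.  The sketch
assumes that 'tolower' (*note tolower::) can be called without
requiring a module.  The definition syntax itself is described in *note
User-Defined Functions: User-defined.

     /* Hypothetical helper: compare two addresses ignoring case.
        Returns a non-zero value if they match.  */
     func same_address(string a, string b) returns number
     do
       return tolower(a) = tolower(b)
     done

     prog envfrom
     do
       /* The function call is used as the condition of 'if'.  */
       if same_address($f, "badguy@some.net")
         reject
       fi
     done

Since the function returns a value, it is called within an expression
rather than as a statement.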


File: mailfromd.info,  Node: Domain Name System,  Next: Checking Sender Address,  Prev: Functions and Modules,  Up: Tutorial

3.5 Domain Name System
======================

Site administrators often do not wish to accept mail from hosts that do
not have a proper reverse delegation in the Domain Name System.  In the
previous section we introduced the library function 'hostname', which
looks up in the DNS the name corresponding to the IP address specified
as its argument.  If there is no corresponding name, the function
returns its argument unchanged.  This can be used to test if the IP was
resolved, as illustrated in the example below:

     require 'dns'

     prog envfrom
     do
       if hostname($client_addr) = $client_addr
         reject
       fi
     done

   The 'require' statement loads the module 'dns.mf', after which the
definition of 'hostname' becomes available.

   A similar function, 'resolve', which resolves the symbolic name to
the corresponding IP address, is provided in the same 'dns.mf' module.


File: mailfromd.info,  Node: Checking Sender Address,  Next: SMTP Timeouts,  Prev: Domain Name System,  Up: Tutorial

3.6 Checking Sender Address
===========================

A special language construct is provided for verification of sender
addresses ("callout"):

     on poll $f do
     when success:
       accept
     when not_found or failure:
       reject 550 5.1.0 "Sender validity not confirmed"
     when temp_failure:
       tempfail 450 4.1.0 "Try again later"
     done

   The 'on poll' construct runs standard verification (*note standard
verification::) for the email address specified as its argument (in the
example above it is the value of the Sendmail macro '$f').  The check
can result in the following conditions:

'success'
     The address exists.

'not_found'
     The address does not exist.

'failure'
     Some error of permanent nature occurred during the check.  The
     existence of the address cannot be verified.

'temp_failure'
     Some temporary failure occurred during the check.  The existence of
     the address cannot be verified at the moment.

   The 'when' branches of the 'on poll' statement introduce statements
that are executed depending on the actual return condition.  If any
condition occurs that is not handled within the 'on' block, the run-time
evaluator will signal an "exception"(1) and return temporary failure,
therefore it is advisable to always handle all four conditions.  In
fact, the condition handling shown in the above example is preferable
for most normal configurations: the mail is accepted if the sender
address is proved to exist and rejected otherwise.  If a temporary
failure occurs, the remote party is urged to retry the transaction some
time later.

   The 'poll' statement itself has a number of options that control the
type of the verification.  These are discussed in detail in *note
poll::.

   It is worth noting that there is one special email address which is
always available on any host: the "null address" '<>' used in error
reporting.  There is no point in verifying its existence:

     prog envfrom
     do
       if $f == ""
         accept
       else
         on poll $f do
         when success:
           accept
         when not_found or failure:
           reject 550 5.1.0 "Sender validity not confirmed"
         when temp_failure:
           tempfail 450 4.1.0 "Try again later"
         done
       fi
     done

   ---------- Footnotes ----------

   (1) For more information about exceptions and their handling, please
refer to *note Exceptions::.


File: mailfromd.info,  Node: SMTP Timeouts,  Next: Avoiding Verification Loops,  Prev: Checking Sender Address,  Up: Tutorial

3.7 SMTP Timeouts
=================

When using polling functions, it is important to take into account
possible delays, which can occur in SMTP transactions.  Such delays may
be due to low network bandwidth or high load on the remote server.  Some
sites impose them deliberately, as a spam-fighting measure.

   Ideally the callout verification should use the timeout values
defined in RFC 2821, but this is impossible in practice, because it
would cause a "timeout escalation", which consists in propagating delays
encountered in a callout SMTP session back to the remote client whose
session initiated the callout.

   Consider, for example, the following scenario.  An MFL script
performs a callout at the 'envfrom' stage.  The remote server is
overloaded and responds with heavy delays, so that the initial response
arrives 3 minutes after establishing the connection, and processing the
'EHLO' command takes another 3 minutes.  These delays are OK according
to the RFC, which imposes a 5 minute limit for each stage, but while
waiting for the remote reply our SMTP server remains in the 'envfrom'
state with the client waiting for a response to its 'MAIL' command for
more than 6 minutes, which is intolerable, because of the same 5 minute
limit.  Thus, the client will almost certainly break the session.

   To avoid this, 'mailfromd' uses a special instance, called "callout
server", which is responsible for running callout SMTP sessions
asynchronously.  The usual sender verification is performed using
so-called "soft" timeout values, which are set to values short enough to
not disturb the incoming session (e.g. a timeout for 'HELO' response is
3 seconds, instead of 5 minutes).
If this verification yields a definite answer, that answer is stored in
the cache database and returned to the calling procedure immediately.
If, however, the verification is aborted due to a timeout, an
'e_temp_failure' exception is returned to the calling procedure, and the
callout is scheduled for processing by a callout server.  This exception
normally causes the milter session to return a temporary error to the
sender, urging it to retry the connection later.

   In the meantime, the callout server runs the sender verification
again using another set of timeouts, called "hard" timeouts, which are
normally much longer than 'soft' ones (they default to the values
required by RFC 2821).  If it gets a definitive result (e.g. 'email
found' or 'email not found'), the server stores it in the cache
database.  If the callout ends due to a timeout, a 'not_found' result is
stored in the database.

   Some time later, the remote server retries the delivery, and the
'mailfromd' script is run again.  This time, the callout function will
immediately obtain the already cached result from the database and
proceed accordingly.  If the callout server has not finished the request
by the time the sender retries the connection, the latter is again
returned a temporary error, and the process continues until the callout
is finished.

   Usually, the callout server is just another instance of 'mailfromd'
itself, which is started automatically to perform scheduled SMTP
callouts.  It is also possible to set up a separate callout server on
another machine.  This is discussed in *note calloutd::.

   For detailed information about callout timeouts and their
configuration, see *note conf-timeout::.

   For a description of how to configure 'mailfromd' to use callout
servers, see *note conf-server::.


File: mailfromd.info,  Node: Avoiding Verification Loops,  Next: HELO Domain,  Prev: SMTP Timeouts,  Up: Tutorial

3.8 Avoiding Verification Loops
===============================

An 'envfrom' program consisting only of the 'on poll' statement will
work smoothly for incoming mails, but will create infinite loops for
outgoing mails.  This is because upon sending an outgoing message
'mailfromd' will start the verification procedure, which will initiate
an SMTP transaction with the same mail server that runs it.  This
transaction will in turn trigger execution of the 'on poll' statement,
etc. ad infinitum.  To avoid this, any properly written filter script
should not run the verification procedure on the email addresses in
those domains that are relayed by the server it runs on.  This can be
achieved using the 'relayed' function.  The function returns 'true' if
its argument is contained in one of the predefined "domain list" files.
These files correspond to 'Sendmail' plain text files used in 'F' class
definition forms (see 'Sendmail Installation and Operation Guide',
chapter 5.3), i.e. they contain one domain name per line, with empty
lines and lines starting with '#' being ignored.  The domain files
consulted by the 'relayed' function are defined in the
'relayed-domain-file' configuration file statement (*note
relayed-domain-file: conf-base.):

     relayed-domain-file (/etc/mail/local-host-names,
                          /etc/mail/relay-domains);

or:

     relayed-domain-file /etc/mail/local-host-names;
     relayed-domain-file /etc/mail/relay-domains;

   The above example declares two domain list files, most commonly used
in 'Sendmail' installations to keep hostnames of the server (1) and
names of the domains relayed by this server(2).
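
   For illustration, a domain list file following the format described
above might look as shown below.  This is only a sketch: the domain
names are hypothetical placeholders, and the contents of the real
'/etc/mail/relay-domains' on your system will differ.

     # Domains relayed by this server (hypothetical names).
     example.org

     mail.example.org

Here the first line and the empty line are ignored, so the file
declares two domains.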

   Given all this, we can improve our filter program:

     require 'dns'

     prog envfrom
     do
       if $f == ""
         accept
       elif relayed(hostname(${client_addr}))
         accept
       else
         on poll $f do
         when success:
           accept
         when not_found or failure:
           reject 550 5.1.0 "Sender validity not confirmed"
         when temp_failure:
           tempfail 450 4.1.0 "Try again later"
         done
       fi
     done

   If you feel that your Sendmail's relayed domains are not restrictive
enough for 'mailfromd' filters (for example you are relaying mails from
some third-party servers), you can use a database of trusted mail server
addresses. If the number of such servers is small enough, a single 'or'
statement can be used, e.g.:

     elif ${client_addr} = "10.10.10.1"
          or ${client_addr} = "192.168.11.7"
       accept
     ...

otherwise, if the servers' IP addresses fall within one or several
CIDRs, you can use the 'match_cidr' function (*note Internet address
manipulation functions::), e.g.:

     elif match_cidr (${client_addr}, "199.232.0.0/16")
       accept
     ...

or combine both methods. Finally, you can keep a DBM database of
relayed addresses and use 'dbmap' or 'dbget' function for checking
(*note Database functions::).

     elif dbmap("%__statedir__/relay.db", ${client_addr})
       accept
     ...

   ---------- Footnotes ----------

   (1) class 'w', see 'Sendmail Installation and Operation Guide',
chapter 5.2.

   (2) class 'R'


File: mailfromd.info, Node: HELO Domain, Next: rset, Prev: Avoiding Verification Loops, Up: Tutorial

3.9 HELO Domain
===============

Some of the mail filtering conditions may depend on the value of "helo
domain" name, i.e. the argument to the SMTP 'EHLO' (or 'HELO') command.
If you ever need such conditions, take into account the following
caveats.
Firstly, although 'Sendmail' passes the helo domain in the '$s'
macro, it does not do this consistently. In fact, the '$s' macro is
available only to the 'helo' handler; all other handlers won't see it,
no matter what the value of the corresponding 'Milter.macros.HANDLER'
statement. So, if you wish to access its value from any handler other
than 'helo', you will have to store it in a "variable" in the 'helo'
handler and then use this variable value in the other handler. This
approach is also recommended for other MTAs. This brings us to the
concept of variables in 'mailfromd' scripts.

   A variable is declared using the following syntax:

     TYPE NAME

where NAME is the variable name and TYPE is 'string', if the
variable is to hold a string value, or 'number', if it is supposed to
have a numeric value.

   A variable is assigned a value using the 'set' statement:

     set NAME EXPR

where EXPR is any valid MFL expression.

   The 'set' statement can occur within handler or function declarations
as well as outside of them.

   There are two kinds of 'Mailfromd' variables: "global variables",
which are visible to all handlers and functions, and "automatic
variables", which are available only within the handler or function
where they are declared. For our purpose we need a global variable
(*Note Variable classes: Variables, for detailed descriptions of both
kinds of variables).

   The following example illustrates an approach that allows you to use
the 'HELO' domain name in any handler:

     # Declare the helohost variable
     string helohost

     prog helo
     do
       # Save the host name for further use
       set helohost $s
     done

     prog envfrom
     do
       # Reject hosts claiming to be localhost
       if helohost = "localhost"
         reject 570 "Please specify real host name"
       fi
     done

   Notice that for this approach to work, your MTA must export the 's'
macro (e.g., in case of Sendmail, the 'Milter.macros.helo' statement in
the 'sendmail.cf' file must contain 's'. *note Sendmail::). This
requirement can be removed by using the "handler argument" of 'helo'.
Each 'mailfromd' handler is given one or several arguments. The exact
number of arguments and their meaning are handler-specific and are
described in *note Handlers::, and *note Figure 3.1:
milter-control-flow. The arguments are referenced by their ordinal
number, using the notation '$N'. The 'helo' handler takes one argument,
whose value is the helo domain. Using this information, the 'helo'
handler from the example above can be rewritten as follows:

     prog helo
     do
       # Save the host name for further use
       set helohost $1
     done


File: mailfromd.info, Node: rset, Next: Controlling Number of Recipients, Prev: HELO Domain, Up: Tutorial

3.10 SMTP RSET and Milter Abort Handling
========================================

In the previous section we used a global variable to hold certain
information and share it between handlers. In the majority of cases,
such information is session specific, and becomes invalid if the remote
party issues the SMTP 'RSET' command. Therefore, 'mailfromd' clears all
global variables when it receives a Milter 'abort' request, which is
normally generated by this command.

   However, you may need some variables that retain their values even
across SMTP session resets. In 'mailfromd' terminology such variables
are called "precious". Precious variables are declared by prefixing
their declaration with the keyword 'precious'. Consider, for example,
this snippet of code:

     precious number rcpt_counter

     prog envrcpt
     do
       set rcpt_counter rcpt_counter + 1
     done

   Here, the variable 'rcpt_counter' is declared as precious and its
value is incremented each time the 'envrcpt' handler is called. This
way, 'rcpt_counter' will keep the total number of SMTP 'RCPT' commands
issued during the session, no matter how many times it was restarted
using the 'RSET' command.


File: mailfromd.info, Node: Controlling Number of Recipients, Next: Sending Rate, Prev: rset, Up: Tutorial

3.11 Controlling Number of Recipients
=====================================

Any MTA provides a way to limit the number of recipients per message.
For example, in 'Sendmail' you may use the 'MaxRecipientsPerMessage'
option(1). However, such methods are not flexible, so you are often
better off using 'mailfromd' for this purpose.

   'Mailfromd' keeps the number of recipients collected so far in
variable 'rcpt_count', which can be controlled in 'envrcpt' handler as
shown in the example below:

     prog envrcpt
     do
       if rcpt_count > 10
         reject 550 5.7.1 "Too many recipients"
       fi
     done

   This filter will accept no more than 10 recipients per message. You
may achieve finer granularity by using additional conditions.
For example, the following code will allow any number of recipients if
the mail is coming from a domain relayed by the server, while limiting
it to 10 for incoming mail from other domains:

     prog envrcpt
     do
       if not relayed(hostname($client_addr)) and rcpt_count > 10
         reject 550 5.7.1 "Too many recipients"
       fi
     done

   There are three important features to notice in the above code.
First of all, it introduces two "boolean" operators: 'and', which
evaluates to 'true' only if both left-side and right-side expressions
are 'true', and 'not', which reverses the value of its argument.

   Secondly, the scope of an operation is determined by its
"precedence", or "binding strength". 'Not' binds more tightly than
'and', so its scope is limited to the expression between it and 'and'.
Using parentheses to underline the operator scoping, the above 'if'
condition can be rewritten as follows:

     if (not (relayed(hostname($client_addr)))) and (rcpt_count > 10)

   Finally, it is important to notice that all boolean expressions are
computed using "shortcut evaluation". To understand what it is, let's
consider the following expression: 'X and Y'. Its value is 'true' only
if both X and Y are 'true'. Now suppose that we evaluate the expression
from left to right and we find that X is false. This means that no
matter what the value of Y is, the resulting expression will be 'false',
therefore there is no need to compute Y at all. So, the boolean
shortcut evaluation works as follows:

'X and Y'
     If 'X => false', do not evaluate Y and return 'false'.

'X or Y'
     If 'X => true', do not evaluate Y and return 'true'.

   Thus, in the expression 'not relayed(hostname($client_addr)) and
rcpt_count > 10', the value of the 'rcpt_count' variable will be
compared with '10' only if the 'relayed' function yielded 'false'.
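
   Shortcut evaluation is easy to observe in any language that
implements it. The Python sketch below is an analogy, not MFL: the
functions 'relayed', 'too_many' and 'should_reject' are hypothetical
stand-ins that record every call in the list 'calls', so you can see
which operands were actually evaluated. The rule that any host under
'example.org' counts as relayed is made up for the example.

```python
calls = []  # records which operands were actually evaluated


def relayed(host):
    """Stand-in for the MFL function: hosts under example.org are relayed."""
    calls.append("relayed")
    return host.endswith(".example.org")


def too_many(rcpt_count):
    """Stand-in for the comparison 'rcpt_count > 10'."""
    calls.append("too_many")
    return rcpt_count > 10


def should_reject(host, rcpt_count):
    # 'not' binds more tightly than 'and'; the right-hand operand is
    # evaluated only when the left-hand one is true.
    return not relayed(host) and too_many(rcpt_count)
```

   Calling 'should_reject("mx.example.org", 50)' evaluates only
'relayed', because the left operand of 'and' is already false. For a
host outside 'example.org' both operands are evaluated.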

   To further enhance our sample filter, you may wish to make the
'reject' output more informative, to let the sender know what the
recipient limit is. To do so, you can use the "concatenation operator"
'.' (a dot):

     set max_rcpt 10
     prog envrcpt
     do
       if not relayed(hostname($client_addr)) and rcpt_count > max_rcpt
         reject 550 5.7.1 "Too many recipients, max=" . max_rcpt
       fi
     done

   When evaluating the third argument to 'reject', 'mailfromd' will
first convert 'max_rcpt' to a string and then concatenate both strings,
producing the string 'Too many recipients, max=10'.

   ---------- Footnotes ----------

   (1) 'Sendmail (tm) Installation and Operation Guide', chapter 5.6, 'O
-- Set Option'.


File: mailfromd.info, Node: Sending Rate, Next: Greylisting, Prev: Controlling Number of Recipients, Up: Tutorial

3.12 Sending Rate
=================

We have introduced the notion of mail sending rate in *note Rate
Limit::. 'Mailfromd' keeps the computed rates in the special 'rate'
database (*note Databases::). Each record in this database consists of
a 'key', for which the rate is computed, and the rate value, in the form
of a double precision floating point number representing the average
number of messages per second sent by this 'key' within the last
sampling interval. In the simplest case, the sender email address can
be used as a 'key'. However, we recommend using the combination
EMAIL-SENDER_IP instead, so that the actual owner of EMAIL is not
blocked by the actions of a spammer abusing that address.

   Two functions are provided to control and update sending rates. The
'rateok' function takes three mandatory arguments:

     bool rateok(string KEY, number INTERVAL, number THRESHOLD)

   The meaning of KEY is described above.
The INTERVAL is the sampling
interval, or the number of seconds to which the actual sending rate
value is converted. Remember that the rate is stored internally as a
floating point number, and thus cannot be directly used in 'mailfromd'
filters, which operate only on integer numbers. To use the rate value,
it is first converted to messages per given interval, which is an
integer number. For example, the rate '0.138888' brought to a 1-hour
interval gives '500' (messages per hour).

   When the 'rateok' function is called, it recomputes the rate record
for the given KEY. If the new rate value converted to messages per
given INTERVAL is less than THRESHOLD, the function updates the database
and returns 'True'. Otherwise it returns 'False' and does not update
the database.

   This function must be "required" prior to use, by placing the
following statement somewhere at the beginning of your script:

     require rateok

   For example, the following code limits the mail sending rate for each
'email address'-'IP' combination to 180 per hour. If the actual rate
value exceeds this limit, the sender is returned a temporary failure
response:

     require rateok

     prog envfrom
     do
       if not rateok($f . "-" . ${client_addr}, 3600, 180)
         tempfail 450 4.7.0 "Mail sending rate exceeded. Try again later"
       fi
     done

Notice the argument concatenation, used to produce the key.

   It is often inconvenient to specify intervals in seconds, therefore a
special 'interval' function is provided. It converts its argument,
which is a textual string representing a time interval in English, to
the corresponding number of seconds. Using this function, the function
invocation would be:

     rateok($f . "-" .
            ${client_addr}, interval("1 hour"), 180)

   The 'interval' function is described in *note interval::, and time
intervals are discussed in *note time interval specification::.

   The 'rateok' function begins computing the rate as soon as it has
collected enough data. By default, it needs at least four mails. Since
this may lead to a large number of false positives (i.e. overestimated
rates) at the beginning of the sampling interval, there is a way to
specify a minimum number of samples 'rateok' must collect before
starting to actually compute rates. This number of samples is given as
the optional fourth argument to the function. For example, the
following call will always return 'True' for the first 10 mails, no
matter what the actual rate:

     rateok($f . "-" . ${client_addr}, interval("1 hour"), 180, 10)

   The 'tbf_rate' function allows you to exercise more control over the
mail rates. This function implements a "token bucket filter" (TBF)
algorithm.

   The token bucket controls when the data can be transmitted based on
the presence of abstract entities called "tokens" in a container called
"bucket". Each token represents some amount of data. The algorithm
works as follows:

   * A token is added to the bucket at a constant rate of 1 token per T
     microseconds.
   * A bucket can hold at most M tokens. If a token arrives when the
     bucket is full, that token is discarded.
   * When N items of data arrive (e.g. N mails), N tokens are removed
     from the bucket and the data are accepted.
   * If fewer than N tokens are available, no tokens are removed from
     the bucket and the data are not accepted.

   This algorithm keeps the data traffic at a constant rate of one item
per T microseconds, with bursts of up to M data items. Such bursts
occur when no data has arrived for M*T or more microseconds.

   'Mailfromd' keeps buckets in a database 'tbf'.
Each bucket is
identified by a unique "key". The 'tbf_rate' function is defined as
follows:

     bool tbf_rate(string KEY, number N, number T, number M)

   The KEY identifies the bucket to operate upon. The remaining
arguments are described above. The 'tbf_rate' function returns 'True'
if the algorithm allows the data to be accepted and 'False' otherwise.

   Depending on how the actual arguments are selected, the 'tbf_rate'
function can be used to control various types of flow rates. For
example, to control the mail sending rate, assign the arguments as
follows: N to the number of mails and T to the control interval in
microseconds:

     prog envfrom
     do
       if not tbf_rate($f . "-" . $client_addr, 1, 10000000, 20)
         tempfail 450 4.7.0 "Mail sending rate exceeded. Try again later"
       fi
     done

   The example above permits sending at most one mail every 10 seconds.
The burst size is set to 20.

   Another use for the 'tbf_rate' function is to limit the total
delivered mail size per given interval of time. To do so, the function
must be used in the 'prog eom' handler, because it is the only handler
where the entire size of the message is known. The N argument must
contain the number of bytes in the email (or email bytes * number of
recipients), and the T must be set to the number of bytes per
microsecond a given user is allowed to send. The M argument must be
large enough to accommodate a couple of large emails. E.g.:

     prog eom
     do
       if not tbf_rate("$f-$client_addr",
                       message_size(current_message()),
                       10240*1000000, # At most 10 kb/sec
                       10*1024*1024)
         tempfail 450 4.7.0 "Data sending rate exceeded. Try again later"
       fi
     done

   *Note Rate limiting functions::, for more information about the
'rateok' and 'tbf_rate' functions.
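
   The four rules of the token bucket algorithm given earlier in this
section translate almost directly into code. The following Python class
is a minimal sketch of those rules, not the 'mailfromd'
implementation: the class and attribute names are hypothetical, time is
passed in explicitly as integer microseconds, and the sketch assumes
that a newly created bucket starts full.

```python
class TokenBucket:
    """One bucket: 1 token per interval_us microseconds, at most burst tokens."""

    def __init__(self, interval_us, burst, now_us=0):
        self.interval_us = interval_us  # T: microseconds per token
        self.burst = burst              # M: bucket capacity
        self.tokens = burst             # assumption: a new bucket is full
        self.stamp_us = now_us          # time of the most recent token

    def allow(self, n, now_us):
        """Try to take N tokens at time now_us; return True if accepted."""
        # Add the tokens that arrived since the last update; partial
        # tokens are ignored and tokens beyond the capacity are discarded.
        new_tokens = (now_us - self.stamp_us) // self.interval_us
        self.stamp_us += new_tokens * self.interval_us
        self.tokens = min(self.burst, self.tokens + new_tokens)
        if n > self.tokens:
            return False  # not enough tokens: nothing is removed
        self.tokens -= n
        return True
```

   With 'interval_us' set to 10000000 the bucket refills at one token
every 10 seconds, which mirrors the T argument in the 'envfrom' example
above. A sender who stays silent for a long time accumulates at most
'burst' tokens.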


File: mailfromd.info, Node: Greylisting, Next: Local Account Verification, Prev: Sending Rate, Up: Tutorial

3.13 Greylisting
================

Greylisting is a simple method of defending against spam proposed by
Evan Harris. In a few words, it consists of recording the 'sender
IP'-'sender email'-'recipient email' triplet of mail transactions. Each
time an unknown triplet is seen, the corresponding message is rejected
with the 'tempfail' code. If the mail is legitimate, this will make the
originating server retry the delivery later, until the destination
eventually accepts it. If, however, the mail is spam, it will
probably never be retried, so the users will not be bothered by it.
Even if the spammer retries the delivery, the "greylisting period"
will give spam-detection systems, such as DNSBLs, enough time to detect
and blacklist it, so by the time the destination host starts accepting
emails from this triplet, it will already be blocked by other means.

   You will find the detailed description of the method in 'The Next
Step in the Spam Control War: Greylisting'
(http://projects.puremagic.com/greylisting/whitepaper.html), the
original whitepaper by Evan Harris.

   The 'mailfromd' implementation of greylisting is based on the
'greylist' function. The function takes two arguments: the 'key',
identifying the greylisting triplet, and the 'interval'. The function
looks up the key in the "greylisting database". If such a key is not
found, a new entry is created for it and the function returns 'true'.
If the key is found, 'greylist' returns 'false' if it was inserted into
the database more than 'interval' seconds ago, and 'true' otherwise. In
other words, from the point of view of the greylisting algorithm, the
function returns 'true' when the message delivery should be blocked.
Thus, the simplest
implementation of the algorithm would be:

     prog envrcpt
     do
       if greylist("${client_addr}-$f-${rcpt_addr}", interval("1 hour"))
         tempfail 451 4.7.1 "You are greylisted"
       fi
     done

   However, the message returned by this example is not informative
enough. In particular, it does not tell when the message will be
accepted. To help you produce more informative messages, the 'greylist'
function stores the number of seconds left to the end of the greylisting
period in the global variable 'greylist_seconds_left', so the above
example could be enhanced as follows:

     prog envrcpt
     do
       set gltime interval("1 hour")
       if greylist("${client_addr}-$f-${rcpt_addr}", gltime)
         if greylist_seconds_left = gltime
           tempfail 451 4.7.1
                "You are greylisted for %gltime seconds"
         else
           tempfail 451 4.7.1
                "Still greylisted for %greylist_seconds_left seconds"
         fi
       fi
     done

   In real life you will have to avoid greylisting some messages, in
particular those coming from the '<>' address and from the IP addresses
in your relayed domain. It can easily be done using the techniques
described in previous sections and is left as an exercise to the reader.

   'Mailfromd' provides two implementations of greylisting primitives,
which differ in the information stored in the database. The one
described above is called "traditional". It keeps in the database the
time when the greylisting was activated for the given key, so the
'greylist' function uses its second argument ('interval') and the
current timestamp to decide whether the key is still greylisted.

   The second implementation is called by the name of its inventor "Con
Tassios".
This implementation stores in the database the time when the
greylisting period is set to expire, computed by 'greylist' when it
is first called for the given key, using the formula 'current_timestamp
+ interval'. Subsequent calls to 'greylist' compare the current
timestamp with the one stored in the database and ignore their second
argument. This implementation is enabled by one of the following
pragmas:

     #pragma greylist con-tassios
or
     #pragma greylist ct

   When the Con Tassios implementation is used, yet another function
becomes available. The function 'is_greylisted' (*note is_greylisted:
Greylisting functions.) returns 'True' if its argument is greylisted and
'False' otherwise. It can be used to check the greylisting status
without actually updating the database:

     if is_greylisted("${client_addr}-$f-${rcpt_addr}")
       ...
     fi

   One special case is "whitelisting", which is often used together with
greylisting. To implement it, 'mailfromd' provides the function
'dbmap', which takes two mandatory arguments: 'dbmap(FILE, KEY)' (it
also allows an optional third argument, see *note dbmap::, for more
information on it). The first argument is the name of the DBM file
in which to search for the key, the second one is the key to be
searched.
Assuming you keep your whitelist database in file
'/var/run/whitelist.db', a more practical example will be:

     prog envrcpt
     do
       set gltime interval("1 hour")

       if not ($f = "" or relayed(hostname(${client_addr}))
              or dbmap("/var/run/whitelist.db", ${client_addr}))
         if greylist("${client_addr}-$f-${rcpt_addr}", gltime)
           if greylist_seconds_left = gltime
             tempfail 451 4.7.1
                  "You are greylisted for %gltime seconds"
           else
             tempfail 451 4.7.1
                  "Still greylisted for %greylist_seconds_left seconds"
           fi
         fi
       fi
     done


File: mailfromd.info, Node: Local Account Verification, Next: Databases, Prev: Greylisting, Up: Tutorial

3.14 Local Account Verification
===============================

In your filter script you may need to verify if the given user name is
served by your mail server, in other words, to verify if it represents a
"local account". Notice that in this context, the word "local" does not
necessarily mean that the account is local for the server running
'mailfromd', it simply means any account whose mailbox is served by the
mail servers using 'mailfromd'.

   The 'validuser' function may be used for this purpose. It takes one
argument, the user name, and returns 'true' if this name corresponds to
a local account. To verify this, the function relies on 'libmuauth', a
powerful authentication library shipped with GNU 'mailutils'. More
precisely, it invokes a list of "authorization" functions. Each
function is responsible for looking up the user name in a particular
source of information, such as system 'passwd' database, an SQL
database, etc. The search is terminated when one of the functions finds
the name in question or the list is exhausted. In the former case, the
account is local, in the latter it is not.
This concept is discussed in
detail in *note Authentication: (mailutils)authentication. Here we
will give only some practical advice for implementing it in 'mailfromd'
filters.

   The actual list of available authorization modules depends on your
'mailutils' installation. Usually it includes, apart from the
traditional UNIX 'passwd' database, the functions for verifying PAM,
RADIUS and SQL database accounts. Each of the authorization methods is
configured using special configuration file statements. For the
description of the Mailutils configuration files, *Note Mailutils
Configuration File: (mailutils)configuration. You can obtain the
template for 'mailfromd' configuration by running 'mailfromd
--config-help'.

   For example, the following 'mailfromd.conf' file:

     auth {
       authorization pam:system;
     }

     pam {
       service mailfromd;
     }

sets up the authorization using PAM and the system 'passwd' database.
The name of the PAM service to use is 'mailfromd'.

   The function 'validuser' is often used together with 'dbmap', as in
the example below:

     #pragma dbprop /etc/mail/aliases.db null

     if dbmap("/etc/mail/aliases.db", localpart($rcpt_addr))
        and validuser(localpart($rcpt_addr))
       ...
     fi

   For more information about the 'dbmap' function, see *note dbmap::.
For a description of the 'dbprop' pragma, see *note Database
functions::.


File: mailfromd.info, Node: Databases, Next: Testing Filter Scripts, Prev: Local Account Verification, Up: Tutorial

3.15 Databases
==============

Some 'mailfromd' functions use DBM databases to save their persistent
state data. Each database has a unique "identifier", and is assigned
several pieces of information for its maintenance: the database "file
name" and the "expiration period", i.e. the time after which a record
is considered expired.

   To obtain the list of available databases along with their
preconfigured settings, run 'mailfromd --show-defaults'. You will see
an output similar to this:

     version:                    8.10
     script file:                /etc/mailfromd.mf
     preprocessor:               /usr/bin/m4 -s
     user:                       mail
     statedir:                   /var/run/mailfromd
     socket:                     unix:/var/run/mailfromd/mailfrom
     pidfile:                    /var/run/mailfromd/mailfromd.pid
     default syslog:             blocking
     supported databases:        gdbm, bdb
     default database type:      bdb
     optional features:          GeoIP
     greylist database:          /var/run/mailfromd/greylist.db
     greylist expiration:        86400
     tbf database:               /var/run/mailfromd/tbf.db
     tbf expiration:             86400
     rate database:              /var/run/mailfromd/rates.db
     rate expiration:            86400
     cache database:             /var/run/mailfromd/mailfromd.db
     cache positive expiration:  86400
     cache negative expiration:  43200

   The text below the 'optional features' line describes the available
built-in databases. Notice that the 'cache' database, in contrast to
the rest of the databases, has two expiration periods associated with
it. This is explained in the next subsection.

* Menu:

* Database Formats::
* Basic Database Operations::
* Database Maintenance::


File: mailfromd.info, Node: Database Formats, Next: Basic Database Operations, Up: Databases

3.15.1 Database Formats
-----------------------

Version 8.10 uses the following database types (or "formats"):

'cache'
     "Cache database" keeps the information about external emails,
     obtained using sender verification functions (*note Checking Sender
     Address::). The key entry to this database is an email address or
     EMAIL:SENDER-IP string, for addresses checked using strict
     verification. The data it stores for each key are:

       1. Address validity.
          This field can be either 'success' or
          'not_found', meaning the address is confirmed to exist or
          not.

       2. The time when the entry was entered into the database. It is
          used to check for expired entries.

     The 'cache' database has two expiration periods: a "positive
     expiration" period, which is applied to entries with the first
     field set to 'success', and a "negative expiration" period, applied
     to entries marked as 'not_found'.

'rate'
     The mail sending rate data, maintained by 'rate' function (*note
     Rate limiting functions::). A record consists of the following
     fields:

     timestamp
          The time when the entry was entered into the database.

     interval
          Interval during which the rate was measured (seconds).

     count
          Number of mails sent during this interval.

'tbf'
     This database is maintained by 'tbf_rate' function (*note TBF::).
     Each record represents a single bucket and consists of the
     following fields:

     timestamp
          Timestamp of most recent token, as a 64-bit unsigned integer
          (microseconds resolution).

     expirytime
          Estimated time when this bucket expires (seconds since epoch).

     tokens
          Number of tokens in the bucket ('size_t').

'greylist'
     This database is maintained by 'greylist' function (*note
     Greylisting::). Each record holds only the timestamp. Its
     semantics depends on the greylisting implementation in use (*note
     greylisting types::). In the traditional implementation, it is the
     time when the entry was entered into the database. In the Con
     Tassios implementation, it is the time when the greylisting period
     expires.
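
   The difference between the two meanings of the 'greylist' timestamp
can be illustrated with a short Python sketch. It is a model of the
behaviour described in *note Greylisting::, not the actual code: the
function names are hypothetical, a dictionary stands in for the DBM
file, the number of seconds left is returned as the second element of a
tuple instead of being stored in 'greylist_seconds_left', and removal of
expired records is not modelled.

```python
def greylist_traditional(db, key, interval, now):
    """Record = time of first contact; INTERVAL is consulted on every call."""
    first_seen = db.setdefault(key, now)
    seconds_left = first_seen + interval - now
    # Blocked unless the key was inserted more than INTERVAL seconds ago.
    return (seconds_left >= 0, max(seconds_left, 0))


def greylist_con_tassios(db, key, interval, now):
    """Record = expiration time; INTERVAL is used by the first call only."""
    expires = db.setdefault(key, now + interval)
    seconds_left = expires - now
    return (seconds_left >= 0, max(seconds_left, 0))
```

   In the traditional variant, calling the function later with a longer
interval extends the greylisting period, because the stored value is
the insertion time. In the Con Tassios variant the stored expiration
time is fixed by the first call, so later interval values have no
effect.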


File: mailfromd.info, Node: Basic Database Operations, Next: Database Maintenance, Prev: Database Formats, Up: Databases

3.15.2 Basic Database Operations
--------------------------------

The 'mfdbtool' utility is provided for performing various operations on
the 'mailfromd' databases.

   To list the contents of a database, use the '--list' option. When
used without any arguments it will list the 'cache' database:

     $ mfdbtool --list
     abrakat@mail.com success Thu Aug 24 15:28:58 2006
     baccl@EDnet.NS.CA not_found Fri Aug 25 10:04:18 2006
     bhzxhnyl@chello.pl not_found Fri Aug 25 10:11:57 2006
     brqp@aaanet.ru:24.1.173.165 not_found Fri Aug 25 14:16:06 2006

   You can also list data for any particular key or keys. To do so,
give the keys as arguments to 'mfdbtool':

     $ mfdbtool --list abrakat@mail.com brqp@aaanet.ru:24.1.173.165
     abrakat@mail.com success Thu Aug 24 15:28:58 2006
     brqp@aaanet.ru:24.1.173.165 not_found Fri Aug 25 14:16:06 2006

   To list another database, give its format identifier with the
'--format' ('-H') option. For example, to list the 'rate' database:

     $ mfdbtool --list --format=rate
     sam@mail.net-62.12.4.3 Wed Sep 6 19:41:42 2006 139 3 0.0216 6.82e-06
     axw@rame.com-59.39.165.172 Wed Sep 6 20:26:24 2006 0 1 N/A N/A

   The '--format' option can be used with any database management
option described below.

   Another useful operation you can do while listing the 'rate' database
is the prediction of "estimated time of sending", i.e. the time when a
user whose mail sending rate currently exceeds the limit will be able to
send mail again. This is done using the '--predict' option. The option
takes an argument, specifying the mail sending rate limit, e.g.
(the
second line is split for readability):

     $ mfdbtool --predict="180 per 1 minute"
     ed@fae.net-21.10.1.2 Wed Sep 13 03:53:40 2006 0 1 N/A N/A; free to send
     service@19.netlay.com-69.44.129.19 Wed Sep 13 15:46:07 2006 7 2
       0.286 0.0224; in 46 sec. on Wed Sep 13 15:49:00 2006

Notice that there is no need to use '--list --format=rate' along with
this option, although doing so is not an error.

   To delete an entry from the database, use the '--delete' option, for
example: 'mfdbtool --delete abrakat@mail.com'. You can give any number
of keys to delete in the command line.


File: mailfromd.info, Node: Database Maintenance, Prev: Basic Database Operations, Up: Databases

3.15.3 Database Maintenance
---------------------------

There are two principal operations of database management: expiration
and compaction. "Expiration" consists of removing expired entries from
the database. In fact, it is rarely needed, since the expired entries
are removed in the process of normal 'mailfromd' work. Nevertheless, a
special option is provided in case an explicit expiration is needed (for
example, before dumping the database to another format, to avoid
transferring useless information).

   The command line option '--expire' instructs 'mfdbtool' to delete
expired entries from the specified database. As usual, the database is
specified using the '--format' option. If it is not given explicitly,
'cache' is assumed.

   When expired entries are removed, the space they occupied is marked
as free, so it can be used by subsequent inserts. The database does not
shrink after expiration is finished. To actually return the unused
space to the file system you should "compact" your database.

   This is done by running 'mfdbtool --compact' (and, optionally,
specifying the database to operate upon with the '--format' option).
Notice that compacting a database needs roughly as much disk space on
the partition where the database resides as is currently used by the
database. Database compaction runs in three phases. First, the
database is scanned and all non-expired records are stored in memory.
Second, a temporary database is created in the state directory and all
the cached entries are flushed into it. This database is named after
the PID of the running 'mfdbtool' process. Finally, the temporary
database is renamed to the source database.

   Both '--compact' and '--expire' can be applied to all databases by
combining them with '--all'. This is useful, for example, in 'crontab'
files. For example, I have the following monthly job in my 'crontab':

     0 1 1 * * /usr/bin/mfdbtool --compact --all


File: mailfromd.info, Node: Testing Filter Scripts, Next: Run Mode, Prev: Databases, Up: Tutorial

3.16 Testing Filter Scripts
===========================

It is important to check your filter script before actually starting to
use it. There are several ways to do so.

   To test the syntax of your filter script, use the '--lint' option.
It will cause 'mailfromd' to exit immediately after attempting to
compile the script file. If the compilation succeeds, the program will
exit with code 0. Otherwise, it will exit with error code 78
('configuration error'). In the latter case, 'mailfromd' will also
print a diagnostic message, describing the error along with the exact
location where the error was diagnosed, for example:

     mailfromd: /etc/mailfromd.mf:39: syntax error, unexpected reject

   The error location is indicated by the name of the file and the
number of the line where the error occurred. By using the
'--location-column' option you instruct 'mailfromd' to also print the
"column number". E.g.
with this option the above error message may
look like:

     mailfromd: /etc/mailfromd.mf:39.12 syntax error, unexpected reject

   Here, '39' is the line and '12' is the column number.

   For complex scripts you may wish to obtain a listing of variables
used in the script. This can be achieved using the '--xref' command
line option.

   The output it produces consists of four columns:

Variable name
Data type
     Either 'number' or 'string'.
Offset in data segment
     Measured in words.
References
     A comma-separated list of locations where the variable was
     referenced. Each location is represented as FILE:LINE. If several
     locations pertain to the same FILE, the file name is listed only
     once.

Here is an example of the cross-reference output:

     $ mailfromd --xref
     Cross-references:
     -----------------
     cache_used             number  5  /etc/mailfromd.mf:48
     clamav_virus_name      string  9  /etc/mailfromd.mf:240,240
     db                     string 15  /etc/mailfromd.mf:135,194,215
     dns_record_ttl         number 16  /etc/mailfromd.mf:136,172,173
     ehlo_domain            string 11
     gltime                 number 13  /etc/mailfromd.mf:37,219,220,222,223
     greylist_seconds_left  number  1  /etc/mailfromd.mf:220,226,227
     last_poll_host         string  2

   If the script passes the syntax check, the next step is often to
test whether it works as you expect it to. This is done with the
'--test' ('-t') command line option. This option runs the 'envfrom'
handler (or another one, see below) and prints the result of its
execution.

   When running your script in test mode, you will need to supply the
values of the 'Sendmail' macros it needs. You do this by placing the
necessary assignments in the command line.
For example, this is how to
supply initial values for the 'f' and 'client_addr' macros:

     $ mailfromd --test f=gray@gnu.org client_addr=127.0.0.1

   You may also need to alter initial values of some global variables
your script uses. To do so, use the '-v' ('--variable') command line
option. This option takes a single argument consisting of the variable
name and its initial value, separated by an equals sign. For example,
here is how to change the value of the 'ehlo_domain' global variable:

     $ mailfromd -v ehlo_domain=mydomain.org

   The '--test' option is often useful in conjunction with the options
'--debug', '--trace' and '--transcript' (*note Logging and Debugging::).
The following example shows what the author got while debugging the
filter script described in *note Filter Script Example:::

     $ mailfromd --test --debug=50 f=gray@gnu.org client_addr=127.0.0.1
     MX 20 mx20.gnu.org
     MX 10 mx10.gnu.org
     MX 10 mx10.gnu.org
     MX 20 mx20.gnu.org
     getting cache info for gray@gnu.org
     found status: success (0), time: Thu Sep 14 14:54:41 2006
     getting rate info for gray@gnu.org-127.0.0.1
     found time: 1158245710, interval: 29, count: 5, rate: 0.172414
     rate for gray@gnu.org-127.0.0.1 is 0.162162
     updating gray@gnu.org-127.0.0.1 rates
     SET REPLY 450 4.7.0 Mail sending rate exceeded. Try again later
     State envfrom: tempfail

   To test any handler other than 'envfrom', give its name as the
argument to the '--test' option. Since this argument is optional, it is
important that it be given immediately after the option, without any
intervening white space, for example 'mailfromd --test=helo', or
'mailfromd -thelo'.

   This method allows you to test one handler at a time. To test the
script as a whole, use the 'mtasim' utility.
When started it enters interactive
mode, similar to that of 'sendmail -bs', where it expects SMTP commands
on its standard input and sends answers to the standard output. The
'--port=auto' command line option instructs it to start 'mailfromd' and
to create a unique socket for communication with it. For a detailed
description of the program and the ways to use it, *Note mtasim::.


File: mailfromd.info, Node: Run Mode, Next: Logging and Debugging, Prev: Testing Filter Scripts, Up: Tutorial

3.17 Run Mode
=============

Mailfromd provides a special option that allows you to run arbitrary
MFL scripts. This is an experimental feature, intended for future use
of MFL as a scripting language.

   When given the '--run' command line option, 'mailfromd' loads the
script given in its command line and executes a function called 'main'.

   The function 'main' must be declared as:

     func main(...) returns number

   Mailfromd passes all command line arguments that follow the script
name as arguments to that function. When the function returns, its
return value is used by 'mailfromd' as exit code.

   As an example, suppose the file 'script.mf' contains the following:

     func main (...)
       returns number
     do
       loop for number i 1,
            while i <= $#,
            set i i + 1
       do
         echo "arg %i=" . $(i)
       done
     done

   This function prints all its arguments (*Note variadic functions::,
for a detailed description of functions with variable number of
arguments). Now running:

     $ mailfromd --run script.mf 1 file dest

displays the following:

     arg 1=1
     arg 2=file
     arg 3=dest

   Note that MFL does not have a direct equivalent of shell's '$0'
argument.
If your function needs to know the name of the script that is
being executed, use the '__file__' built-in constant instead (*note
__file__: Built-in constants.).

   You may give your start function any name other than the default
'main'. In this case, give its name as an argument to the '--run'
option. This argument is optional, therefore it must be separated from
the option by an equals sign (with no whitespace on either side). For
example, given the command line below, 'mailfromd' loads the file
'script.mf' and executes the function named 'start':

     $ mailfromd --run=start script.mf

* Menu:

* top-block::                   The Top of a Script File.
* getopt::                      Parsing Command Line Arguments.


File: mailfromd.info, Node: top-block, Next: getopt, Up: Run Mode

3.17.1 The Top of a Script File
-------------------------------

The '--run' option makes it possible to use 'mailfromd' scripts as
standalone programs. The traditional way to do so was to set the
executable bit on the script file and to begin the script with the
"interpreter selector", i.e. the characters '#!' followed by the name
of the 'mailfromd' executable, e.g.:

     #! /usr/sbin/mailfromd --run

   This would cause the shell to invoke 'mailfromd' with the command
line constructed from the '--run' option, the name of the invoked script
file itself, and any actual arguments from the invocation. Once
invoked, 'mailfromd' would treat the initial '#!' line as a usual
single-line comment (*note Comments::).

   However, the interpretation of the '#!' by shells has various
deficiencies, which depend on the actual shell being used. For example,
some shells pass any characters following the whitespace after the
interpreter name as a single argument, some others silently truncate the
command line after some number of characters, etc.
This often makes it
impossible to pass additional arguments to 'mailfromd'. For example, a
script which begins with the following line would most probably fail to
be executed properly:

     #! /usr/sbin/mailfromd --no-config --run

   To compensate for these deficiencies and to allow for more complex
invocation sequences, 'mailfromd' handles an initial '#!' in a special
way. If the first line of a source file begins with '#!/' or '#! /'
(with a single space between '!' and '/'), it is treated as the start
of a multi-line comment, which is closed by the two characters '!#' on a
line by themselves.

   Thus, the correct way to begin a 'mailfromd' script is:

     #! /usr/sbin/mailfromd --run
     !#

   Using this feature, you can start a 'mailfromd' script with
arbitrary shell code, provided it ends with an 'exec' statement invoking
the interpreter itself. For example:

     #!/bin/sh
     exec /usr/sbin/mailfromd --no-config --run "$0" "$@"
     !#

     func main(...)
       returns number
     do
       /* actual mfl code goes here */
     done

   Note the use of '$0' and '$@' to pass the actual script file name and
command line arguments to 'mailfromd'. They are double-quoted so that
the shell passes arguments containing whitespace intact.


File: mailfromd.info, Node: getopt, Prev: top-block, Up: Run Mode

3.17.2 Parsing Command Line Arguments
-------------------------------------

A special function is provided to break (parse) options in command
lines, and to check for legal options. It uses the GNU getopt routines
(*note getopt: (libc)Getopt.).

 -- Built-in Function: string getopt (number ARGC, pointer ARGV, ...)
     The 'getopt' function parses the command line arguments, as
     supplied by ARGC and ARGV. The ARGC argument is the argument
     count, and ARGV is an opaque data structure, representing the array
     of arguments(1). The operator 'vaptr' (*note vaptr::) is provided
     to initialize this argument.

     An argument that starts with '-' (and is not exactly '-' or '--')
     is an option element. An argument that starts with a single '-' is
     called a "short" or "traditional" option. The characters of this
     element, except for the initial '-', are option characters. Each
     option character represents a separate option. An argument that
     starts with '--' is called a "long" or "GNU" option. The
     characters of this element, except for the initial '--', form the
     "option name".

     Options may have arguments. The argument to a short option is
     supplied immediately after the option character, or as the next
     word in the command line. E.g., if option '-f' takes a mandatory
     argument, then it may be given either as '-farg' or as '-f arg'.
     The argument to a long option is either given immediately after it
     and separated from the option name by an equals sign (as
     '--file=arg'), or is given as the next word in the command line
     (e.g. '--file arg').

     If the option argument is optional, i.e. it may not necessarily be
     given, then only the first form is allowed (i.e. either '-farg' or
     '--file=arg').

     The '--' command line argument ends the option list. Any arguments
     following it are not considered options, even if they begin with a
     dash.

     If 'getopt' is called repeatedly, it returns successively each of
     the option characters from each of the option elements (for short
     options) and each option name (for long options). In this case,
     the actual arguments are supplied only to the first invocation.
     Subsequent calls must be given two nulls as arguments. Such an
     invocation instructs 'getopt' to use the values saved on the
     previous invocation.

     When the function finds another option, it returns its character or
     name, updating the external variable 'optind' (see below) so that
     the next call to 'getopt' can resume the scan with the following
     option.
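
     As a minimal sketch of this calling sequence, the fragment below
     makes the initial call with the real arguments and then a
     follow-up call with two zeros, which is also how the complete
     example later in this section passes the "two nulls". The option
     definition '"v|verbose"' and the variable 'rc' are hypothetical
     names chosen for illustration only; the syntax of option
     definitions is explained further on.

          /* First call: pass the actual argument count and vector. */
          set rc getopt($#, vaptr($1), "v|verbose")
          /* Any further call: two zeros reuse the saved arguments. */
          set rc getopt(0, 0)

     In practice the second form normally appears as the update step
     of a loop, so that it is repeated until 'getopt' reports that no
     options are left.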

     When there are no more options left, or a '--' argument is
     encountered, 'getopt' returns an empty string. Then 'optind' gives
     the index in ARGV of the first element that is not an option.

     The legitimate options and their characteristics are supplied in
     additional arguments to 'getopt'. Each such argument is a string
     consisting of two parts, separated by a vertical bar ('|'). Either
     of these parts is optional, but at least one of them must be
     present. The first part specifies the short option character. If
     it is followed by a colon, this character takes a mandatory
     argument. If it is followed by two colons, this character takes an
     optional argument. If only the first part is present, the '|'
     separator may be omitted. Examples:

     "c"
     "c|"
          Short option '-c'.

     "f:"
     "f:|"
          Short option '-f', taking a mandatory argument.

     "f::"
     "f::|"
          Short option '-f', taking an optional argument.

     If the vertical bar is present and is followed by any characters,
     these characters specify the name of a long option, synonymous to
     the short one, specified by the first part. Any mandatory or
     optional arguments to the short option remain mandatory or optional
     for the corresponding long option. Examples:

     "f:|file"
          Short option '-f', or long option '--file', requiring an
          argument.

     "f::|file"
          Short option '-f', or long option '--file', taking an optional
          argument.

     In any of the above cases, if this option appears in the command
     line, 'getopt' returns its short option character.

     To define a long option without a short equivalent, begin it with a
     bar, e.g.:

     "|help"

     If this option is to take an argument, this is specified using the
     mechanism described above, except that the short option character
     is replaced with a minus sign.
For example:

     "-:|output"
          Long option '--output', which takes a mandatory argument.

     "-::|output"
          Long option '--output', which takes an optional argument.

     If an option is returned that has an argument in the command line,
     'getopt' stores this argument in the variable 'optarg'.

     After each invocation, 'getopt' sets the variable 'optind' to the
     index of the next ARGV element to be parsed. Thus, when the list
     of options is exhausted and the function has returned an empty
     string, 'optind' contains the index of the first element that is
     not an option.

     When 'getopt' encounters an option that is not described in its
     arguments, or if it detects a missing option argument, it prints an
     error message using 'mailfromd' logging facilities, stores the
     offending option in the variable 'optopt', and returns '?'.

     If printing the error message is not desired (e.g. the application
     is going to take care of error messaging), it can be disabled by
     setting the variable 'opterr' to '0'.

     The third argument to 'getopt', called the "controlling argument",
     may be used to control the behavior of the function. If it is a
     colon, it disables printing the error message for unrecognized
     options and missing option arguments (as setting 'opterr' to '0'
     does). In this case 'getopt' returns ':' instead of '?' to
     indicate a missing option argument.

     If the controlling argument is a plus sign, or the environment
     variable 'POSIXLY_CORRECT' is set, then option processing stops as
     soon as a non-option argument is encountered. By default, if
     options and non-option arguments are intermixed in ARGV, 'getopt'
     permutes them so that the options go first, followed by
     non-option arguments.
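
     The error-reporting conventions described above could be combined
     roughly as in the following sketch. It is an illustration under
     stated assumptions rather than a tested recipe: the script is
     assumed to take a single hypothetical option '-f' ('--file'), the
     message texts are made up, and 'optopt' is assumed to be usable as
     a string in concatenation. Passing '":"' as the controlling
     argument silences the built-in diagnostics, so the script reports
     the problems itself:

          func main(...)
            returns number
          do
            loop for string rc getopt($#, vaptr($1), ":", "f:|file"),
                 while rc != "",
                 set rc getopt(0, 0)
            do
              switch rc
              do
              case "f":
                /* Hypothetical variable holding the file name. */
                set file optarg
              case ":":
                /* Returned only because of the ":" controlling
                   argument. */
                echo "missing argument for option " . optopt
                return 1
              case "?":
                echo "unrecognized option " . optopt
                return 1
              done
            done
            return 0
          done

     Without the controlling argument, both error conditions would be
     logged by 'getopt' itself and reported to the script as '?'.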

     If the controlling argument is '-', then each non-option element in
     ARGV is handled as if it were the argument of an option with
     character code 1 ('"\001"', in MFL notation). This can be used by
     programs that are written to expect options and other ARGV-elements
     in any order and that care about the ordering of the two.

     Any other value of the controlling argument is handled as an option
     definition.

   A special language construct is provided to supply the second
argument (ARGV) to 'getopt' and similar functions:

     vaptr(PARAM)

where PARAM is a positional parameter, from which to start the array of
ARGV. For example:

     func main(...)
       returns number
     do
       set rc getopt($#, vaptr($1), "|help")
       ...

   Here, 'vaptr($1)' constructs the ARGV array from all the arguments
supplied to the function 'main'.

   To illustrate the use of the 'getopt' function, let's suppose you
write a script that takes the following options:

'-f FILE'
'--file=FILE'

'--output[=DIR]'

'--help'

   Then, the corresponding 'getopt' invocation will be:

     func main(...)
       returns number
     do
       loop for string rc getopt($#, vaptr($1),
                                 "f:|file", "-::|output", "h|help"),
            while rc != "",
            set rc getopt(0, 0)
       do
         switch rc
         do
         case "f":
           set file optarg
         case "output":
           set output 1
           set output_dir optarg
         case "h":
           help()
         default:
           return 1
         done
       ...

   ---------- Footnotes ----------

   (1) When MFL has an array data type, the second argument will change
to an array of strings.


File: mailfromd.info, Node: Logging and Debugging, Next: Runtime errors, Prev: Run Mode, Up: Tutorial

3.18 Logging and Debugging
==========================

Depending on its operation mode, 'mailfromd' tries to guess whether it
is appropriate to print its diagnostics and informational messages on
standard error or to send them to syslog. Standard error is assumed if
the program is run with one of the following command line options:

   * '--test' (*note Testing Filter Scripts::)
   * '--run' (*note Run Mode::)
   * '--lint' (*note Testing Filter Scripts::)
   * '--dump-code' (*note Logging and Debugging Options::)
   * '--dump-grammar-trace' (*note Logging and Debugging Options::)
   * '--dump-lex-trace' (*note Logging and Debugging Options::)
   * '--dump-macros' (*note Logging and Debugging Options::)
   * '--dump-tree' (*note Logging and Debugging Options::)
   * '--xref' (or '--dump-xref') (*note Testing Filter Scripts::)

   If none of these are used, 'mailfromd' switches to syslog as soon as
it finishes its startup. There are two ways to communicate with the
'syslogd' daemon: using the 'syslog' function from the system 'libc'
library, which is a "blocking" implementation in most cases, or via the
internal, "asynchronous", syslog implementation. Whether the latter is
compiled in and which of the implementations is used by default is
determined while compiling the package, as described in *note Using
non-blocking syslog: syslog-async.

   The '--logger' command line option allows you to manually select the
diagnostic channel:

'--logger=stderr'
     Log everything to the standard error.

'--logger=syslog'
     Log to syslog.

'--logger=syslog:async'
     Log to syslog using the asynchronous syslog implementation.

   Another way to select the diagnostic channel is by using the 'logger'
statement in the configuration file.
The statement takes the same
argument as its command line counterpart.

   The remaining details of diagnostic output are controlled by the
'logging' configuration statement.

   The default syslog facility is 'mail'; it can be changed using the
'--log-facility' command line option or the 'facility' statement. The
argument in both cases is a valid facility name, i.e. one of: 'user',
'daemon', 'auth', 'authpriv', 'mail', and 'local0' through 'local7'.
The argument can be given in upper, lower or mixed case, and it can be
prefixed with 'log_'.

   Another syslog-related parameter that can be configured is the "tag",
which identifies 'mailfromd' messages. The default tag is the program
name. It is changed by the '--log-tag' ('-L') command line option and
the 'tag' logging statement.

   The following example configures both the syslog facility and tag:

     logging {
       facility local7;
       tag "mfd";
     }

   Like any other UNIX utility, 'mailfromd' is very quiet unless it has
something important to communicate, such as an error condition. A set
of command line options is provided for controlling the verbosity of
its output.

   The '--trace' option enables tracing of Sendmail actions executed
during message verification. When this option is given, any 'accept',
'discard', 'continue', etc. triggered during execution of your filter
program will leave its trace in the log file. Here is an example of
what it looks like (syslog time stamp, tag and PID removed for
readability):

     k8DHxvO9030656: /etc/mailfromd.mf:45: reject 550 5.1.1 Sender validity
     not confirmed

This shows that while verifying the message with ID 'k8DHxvO9030656' the
'reject' action was executed by filter script '/etc/mailfromd.mf' at
line 45.

   The use of message ID in the log deserves a special notice.
The
program will always identify its log messages with the 'Message-Id',
when it is available. Your responsibility as an administrator is to
make sure it is available by configuring your MTA to export the macro
'i' to 'mailfromd'. The rule of thumb is: make 'i' available to the
very first handler 'mailfromd' executes. It is not necessary to export
it to the rest of the handlers, since 'mailfromd' will cache it. For
example, if your filter script contains 'envfrom' and 'envrcpt'
handlers, export 'i' for 'envfrom'. The exact instructions on how to
ensure it depend on the MTA you use. For 'Sendmail', refer to *note
Sendmail::. For MeTA1, see *note MeTA1::, and *note pmult-macros::.
For 'Postfix', see *note Postfix::.

   To push log verbosity further, use the 'debug' configuration
statement (*note conf-debug::) or its command line equivalent, '--debug'
('-d', *note --debug::). Its argument is a "debugging level", whose
syntax is described in <http://mailutils.org/wiki/Debug_level>.

   The debugging output is controlled by a set of levels, each of which
can be set independently of others. Each debug level consists of a
category name, which identifies the part of the package for which
additional debugging is desired, and a level number, which indicates how
verbose its output should be.

   Valid debug levels are:

error
     Displays error conditions which are normally not reported, but
     passed to the caller layers for handling.

trace0 through trace9
     Ten levels of verbosity, 'trace0' producing the least output,
     'trace9' producing the maximum amount of output.

prot
     Displays network protocol interaction, where applicable.

   The overall debugging level is specified as a list of individual
levels, delimited with semicolons.
Each individual level can be
specified as one of:

!CATEGORY
     Disables all levels for the specified category.

CATEGORY
     Enables all levels for the specified category.

CATEGORY.LEVEL
     For this category, enables all levels from 'error' to LEVEL,
     inclusive.

CATEGORY.=LEVEL
     Enables only the given LEVEL in this CATEGORY.

CATEGORY.!LEVEL
     Disables all levels from 'error' to LEVEL, inclusive, in this
     CATEGORY.

CATEGORY.!=LEVEL
     Disables only the given LEVEL in this CATEGORY.

CATEGORY.LEVELA-LEVELB
     Enables all levels in the range from LEVELA to LEVELB, inclusive.

CATEGORY.!LEVELA-LEVELB
     Disables all levels in the range from LEVELA to LEVELB, inclusive.

   Additionally, a comma-separated list of level specifications is
allowed after the dot. For example, the following specification:

     acl.prot,!=trace9,!trace2

enables in category acl all levels, except trace9, trace0, trace1, and
trace2.

   Implementation and applicability of each level of debugging differs
between various categories. Categories built-in to mailutils are
described in <http://mailutils.org/wiki/Debug_level>. Mailfromd
introduces the following additional categories:

db
     trace0
          Detailed debugging info about expiration and compaction.
     trace5
          List records being removed.

dns
     trace8
          Verbose information about attempted DNS queries and their
          results.
     trace9
          Enables 'libadns' internal debugging.

srvman
     trace0
          Additional information about normal conditions, such as
          subprocess exiting successfully or a remote party being
          allowed access by ACL.
     trace1
          Detailed transcript of server manager actions: startup,
          shutdown, subprocess cleanups, etc.
     trace3
          Additional info about fd sets.
     trace4
          Individual subserver status information.
     trace5
          Subprocess registration.

pmult
     trace1
          Verbosely list incoming connections, functions being executed
          and erroneous conditions: missing headers in SMFIR_CHGHEADER,
          undefined macros, etc.
     trace2
          List milter requests being processed.
     trace7
          List SMTP body content in SMFIR_REPLBODY requests.
     error
          Verbosely list mild errors encountered: bad recipient
          addresses, etc.

callout
     trace0
          Verification session transcript.
     trace1
          MX servers checks.
     trace5
          List emails being checked.
     trace9
          Additional info.

main
     trace5
          Info about hostnames in relayed domain list

engine
     Debugging of the virtual engine.
     trace5
          Message modification lists.
     trace6
          Debug message modification operations and Sendmail macros
          registered.
     trace7
          List SMTP stages ('xxfi_*' calls).
     trace9
          Cleanup calls.

pp
     Preprocessor.

     trace1
          Show command line of the preprocessor being run.

prog
     trace8
          Stack operations
     trace9
          Debug exception state save/restore operations.

spf
     error
          Mild errors.
     trace0
          List calls to 'spf_eval_record', 'spf_test_record',
          'spf_check_host_internal', etc.
     trace1
          General debug info.
     trace6
          Explicitly list A records obtained when processing the 'a' SPF
          mechanism.

   Categories starting with 'bi_' debug built-in modules:

bi_db
     Database functions.
     trace5
          List database look-ups.
     trace6
          Trace operations on the greylisting database.

bi_sa
     SpamAssassin and ClamAV API.
     trace1
          Report the findings of the 'clamav' function.
     trace9
          Trace payload in interactions with 'spamd'.

bi_io
     I/O functions.
     trace1
          Debug the following functions: 'open', 'spawn', 'write'.
     trace2
          Report stderr redirection.
     trace3
          Report external commands being run.

bi_mbox
     Mailbox functions.
     trace1
          Report opened mailboxes.

bi_other
     Other built-ins.
     trace1
          Report results of checks for existence of usernames.

   For example, the following invocation enables levels up to 'trace2'
in category 'engine', all levels in category 'savsrv' and levels up to
'trace0' in category 'srvman':

     $ mailfromd --debug='engine.trace2;savsrv;srvman.trace0'

   You need to have sufficient knowledge of the 'mailfromd' internal
structure to use this form of the '--debug' option.

   To trace the execution of the sender verification functions (*note
SMTP Callout functions::), you may use the '--transcript' ('-X') command
line option, which enables transcripts of SMTP sessions in the logs.
Here is an example of the output produced by running 'mailfromd
--transcript':

     k8DHxlCa001774: RECV: 220 spf-jail1.us4.outblaze.com ESMTP Postfix
     k8DHxlCa001774: SEND: HELO mail.gnu.org.ua
     k8DHxlCa001774: RECV: 250 spf-jail1.us4.outblaze.com
     k8DHxlCa001774: SEND: MAIL FROM: <>
     k8DHxlCa001774: RECV: 250 Ok
     k8DHxlCa001774: SEND: RCPT TO: <t1Kmx17Q@malaysia.net>
     k8DHxlCa001774: RECV: 550 <>: No thank you rejected: Account
     Unavailable: Possible Forgery
     k8DHxlCa001774: poll exited with status: not_found; sent
     "RCPT TO: <t1Kmx17Q@malaysia.net>", got "550 <>: No thank you
     rejected: Account Unavailable: Possible Forgery"
     k8DHxlCa001774: SEND: QUIT


File: mailfromd.info, Node: Runtime errors, Next: Notes, Prev: Logging and Debugging, Up: Tutorial

3.19 Runtime Errors
===================

A "runtime error" is a special condition encountered during execution of
the filter program that makes further execution of the program
impossible.
There are two kinds of runtime errors: fatal errors, and
uncaught exceptions. Whenever a runtime error occurs, 'mailfromd'
writes the following message to the log file:

     RUNTIME ERROR near FILE:LINE: TEXT

where FILE:LINE indicates the approximate source file location where
the error occurred and TEXT gives the textual description of the error.

Fatal runtime errors
--------------------

Fatal runtime errors are caused by a condition that is impossible to fix
at run time. For version 8.10 these are:

Not enough memory
     There is not enough memory for the execution of the program. Try
     to make more memory available for 'mailfromd' or to reduce its
     memory requirements by rewriting your filter script.

Out of stack space; increase #pragma stacksize
Heap overrun; increase #pragma stacksize
memory chunk too big to fit into heap
     These errors are reported when there is not enough space left on
     the stack to perform the requested operation, and the attempt to
     resize the stack has failed. Usually 'mailfromd' expands the stack
     when the need arises (*note automatic stack resizing::). This
     runtime error indicates that there was no more memory available for
     stack expansion. Try to make more memory available for 'mailfromd'
     or to reduce its memory requirements by rewriting your filter
     script.

Stack underflow
     The program attempted to pop a value off the stack but the stack
     was already empty. This indicates an internal error in the MFL
     compiler or 'mailfromd' runtime engine. If you ever encounter this
     error, please report it to <bug-mailfromd@gnu.org.ua>. Include the
     log fragment (about 10-15 lines before and after this log message)
     and your filter script. *Note Reporting Bugs::, for more
     information about bug reporting.

pc out of range
     The "program counter" is out of allowed range.
     This is a severe error, indicating an internal inconsistency in
     the 'mailfromd' runtime engine. If you encounter it, please report
     it to <bug-mailfromd@gnu.org.ua>. Include the log fragment (about
     10-15 lines before and after this log message) and your filter
     script. *Note Reporting Bugs::, for more information about how to
     report a bug.

Programmatic runtime errors
---------------------------

These indicate a programmatic error in your filter script, which the MFL
compiler was unable to discover at the compilation stage:

Invalid exception number: N
     The 'throw' statement used a nonexistent exception number N. Fix
     the statement and restart 'mailfromd'. *Note throw::, for
     information about the 'throw' statement, and *note Exceptions::,
     for the list of available exception codes.

No previous regular expression
     You have used a back-reference (*note Back references::), where
     there is no previous regular expression to refer to. Fix this line
     in your code and restart the program.

Invalid back-reference number
     You have used a back-reference (*note Back references::), with a
     number greater than the number of available groups in the previous
     regular expression. For example:

          if $f matches "(.*)@gnu.org"
            # Wrong: there is only one group in the regexp above!
            set x \2
            ...

     Fix your code and restart the daemon.

Uncaught exceptions
-------------------

Another kind of runtime error is an "uncaught exception", i.e. an
exceptional condition for which no handler was installed (*Note
Exceptions::, for information on exceptions and on how to handle them).
These errors mean that the programmer (i.e. you) made no provision for
some specific condition.
For example, consider the following code:

     prog envfrom
     do
       if $f mx matches "yahoo.com"
         foo()
       fi
     done

It is syntactically correct, but it overlooks the fact that 'mx matches'
may generate an 'e_temp_failure' exception if the underlying DNS query
has timed out (*note Special comparisons::). If this happens, 'mailfromd'
has no instructions on what to do next and reports an error. This can
easily be fixed using a 'catch' statement, e.g.:

     prog envfrom
     do
       # Catch DNS errors
       catch e_temp_failure or e_failure
       do
         tempfail 451 4.1.1 "MX verification failed"
       done

       if $f mx matches "yahoo.com"
         foo()
       fi
     done

   Another common case is an undefined Sendmail macro. In this case the
'e_macroundef' exception is generated:

     RUNTIME ERROR near foo.c:34: Macro not defined: {client_adr}

This can be caused either by misspelling the macro name (as in the
example message above) or by failing to export the required name in the
Sendmail milter configuration (*note exporting macros::). This error
should be fixed either in your source code or in the 'sendmail.cf' file,
but if you wish to provide special handling for it, you can use the
following catch statement:

     catch e_macroundef
     do
       ...
     done

   Sometimes the location indicated in the runtime error message is
not enough to trace the origin of the error.
For example, an error can
be generated explicitly with the 'throw' statement (*note throw::):

     RUNTIME ERROR near match_cidr.mf:30: invalid CIDR (text)

   If you look in module 'match_cidr.mf', you will see the following
code (line numbers added for reference):

     23  func match_cidr(string ipstr, string cidr) returns number
     24  do
     25    number netmask
     26
     27    if cidr matches '^(([0-9]{1,3}\.){3}[0-9]{1,3})/([0-9][0-9]?)'
     28      return inet_aton(ipstr) & len_to_netmask(\3) = inet_aton(\1)
     29    else
     30      throw invcidr "invalid CIDR (%cidr)"
     31    fi
     32    return 0
     33  done

   Now, it is obvious that the value of the 'cidr' argument to
'match_cidr' was wrong, but how do you find the caller that passed the
wrong value to it? The special command line option '--stack-trace' is
provided for this. This option enables dumping "stack traces" when a
fatal error occurs. The traces contain information about function
calls. Continuing our example, using the '--stack-trace' option you will
see the following diagnostics:

     RUNTIME ERROR near match_cidr.mf:30: invalid CIDR (127%)
     mailfromd: Stack trace:
     mailfromd: 0077: match_cidr.mf:30: match_cidr
     mailfromd: 0096: test.mf:13: bar
     mailfromd: 0110: mailfromd.mf:18: foo
     mailfromd: Stack trace finishes
     mailfromd: Execution of the configuration program was not finished

   Each trace line describes one stack frame. The lines appear in
order from the most recently called to the least recently called. Each
frame consists of:

  1. Value of the program counter at the time of its execution;
  2. Source code location, if available;
  3. Name of the function called.

   Thus, the example above can be read as: "the function 'match_cidr'
was called by the function 'bar' in file 'test.mf' at line 13. In its
turn, 'bar' was called by the function 'foo', in file 'mailfromd.mf' at
line 18".

   Examining caller functions will help you localize the source of the
error and fix it.

   You can also request a stack trace at any place in your code, by
calling the 'stack_trace' function. This can be useful for debugging, or
in your 'catch' statements.


File: mailfromd.info,  Node: Notes,  Prev: Runtime errors,  Up: Tutorial

3.20 Notes and Cautions
=======================

This section discusses some potential pitfalls in the MFL.

   It is important to exercise special caution when writing format
strings for the 'sprintf' (*note String formatting::) and 'strftime'
(*note strftime::) functions. They use '%' as a character introducing
conversion specifiers, while the same character is used to expand an MFL
variable within a string. To prevent this misinterpretation, always
enclose the format specification in _single quotes_ (*note
singe-vs-double::). To illustrate this, let's consider the following
example:

     echo sprintf ("Mail from %s", $f)

   If a variable 's' is not declared, this line will produce the
'Variable s is not defined' error message, which will allow you to
identify and fix the bug. The situation is considerably worse if 's' is
declared. In that case you will see no warning message, as the
statement is perfectly valid, but at run time the variable 's' will be
interpreted within the format string, and its value will replace '%s'.
To prevent this from happening, single quotes must be used:

     echo sprintf ('Mail from %s', $f)

   This does not limit the functionality, since there is no need to fall
back to variable interpretation in format strings.

   Yet another dangerous feature of the language is the way to refer to
variable and constant names within literal strings.
To expand a
variable or a constant the same notation is used (*Note Variables::, and
*note Constants::). Now, let's consider the following code:

     const x 2
     string x "X"

     prog envfrom
     do
       echo "X is %x"
     done

   Does '%x' in 'echo' refer to the variable or to the constant? The
correct answer is 'to the variable'. When executed, this code will
print 'X is X'.

   As of version 8.10, 'mailfromd' will always print a diagnostic
message whenever it stumbles upon a variable having the same name as a
previously defined constant or vice versa. The resolution of such name
clashes is described in detail in *Note variable--constant shadowing::.

   Future versions of the program may provide a non-ambiguous way of
referring to variables and constants from literal strings.


File: mailfromd.info,  Node: MFL,  Next: Library,  Prev: Tutorial,  Up: Top

4 Mail Filtering Language
*************************

The "mail filtering language", or MFL, is a special language designed
for writing filter scripts. It has a simple syntax, similar to that of
the Bourne shell. In contrast to most existing programming languages,
MFL does not have any special terminating or separating characters
(like, e.g. newlines and semicolons in shell)(1). All syntactical
entities are separated by any amount of white-space characters (i.e.
spaces, tabulations or newlines).

   The following sections describe MFL syntax in detail.

* Menu:

* Comments::                    Comments.
* Pragmas::                     Pragmatic comments.
* Data Types::
* Numbers::
* Literals::
* Here Documents::
* Sendmail Macros::
* Constants::
* Variables::
* Back references::
* Handlers::
* begin/end::
* Functions::                   Functions.
* Expressions::                 Expressions.
* Shadowing::                   Variable and Constant Shadowing.
* Statements::
* Conditionals::                Conditional Statements.
* Loops::                       Loop Statements.
* Exceptions::                  Exceptional Conditions and their Handling.
* Polling::                     Sender Verification Tests.
* Modules::                     Modules are Collections of Useful Functions.
* Preprocessor::                Input Text Is Preprocessed.
* Filter Script Example::       A Working Filter Script Explained.
* Reserved Words::              A Reference List of Reserved Words.

   ---------- Footnotes ----------

   (1) There are two noteworthy exceptions: 'require' and 'from ...
import' statements, which must be terminated with a period. *Note
import::.


File: mailfromd.info,  Node: Comments,  Next: Pragmas,  Up: MFL

4.1 Comments
============

Two types of comments are allowed: C-style, enclosed between '/*' and
'*/', and shell-style, starting with the '#' character and extending up
to the end of line:

     /* This is
        a comment. */
     # And this too.

   There are, however, several special cases, where the characters
following '#' are not ignored.

   If the first line begins with '#!/' or '#! /', this is treated as the
start of a multi-line comment, which is closed by the characters '!#' on
a line by themselves. This feature allows for writing sophisticated
scripts. *Note top-block::, for a detailed description.

   If '#' is followed by the word 'include' (with optional whitespace
between them), this statement requires inclusion of the specified file,
as in C. There are two forms of the '#include' statement:

  1. '#include <FILE>'
  2. '#include "FILE"'

   The quotes around FILE in the second form are optional.

   Both forms are equivalent if FILE is an absolute file name.
Otherwise, the first form will look for FILE in the "include search
path".
The second one will look for it in the current working directory
first, and, if not found there, in the include search path.

   The default include search path is:

  1. 'PREFIX/share/mailfromd/8.10/include'
  2. 'PREFIX/share/mailfromd/include'
  3. '/usr/share/mailfromd/include'
  4. '/usr/local/share/mailfromd/include'

where PREFIX is the installation prefix.

   New directories can be prepended to it using the '-I' ('--include')
command line option, or the 'include-path' configuration statement
(*note include-path: conf-base.).

   For example, invoking

     $ mailfromd -I/var/mailfromd -I/com/mailfromd

creates the following include search path:

  1. '/var/mailfromd'
  2. '/com/mailfromd'
  3. 'PREFIX/share/mailfromd/8.10/include'
  4. 'PREFIX/share/mailfromd/include'
  5. '/usr/share/mailfromd/include'
  6. '/usr/local/share/mailfromd/include'

   Along with '#include', there is also a special form '#include_once',
that has the same syntax:

     #include_once <FILE>
     #include_once "FILE"

   This form works exactly as '#include', except that, if the FILE has
already been included, it will not be included again. As the name
suggests, it will be included only once.

   This form should be used to prevent re-inclusion of code, which can
cause problems due to function redefinitions, variable reassignments
etc.

   A line in the form

     #line NUMBER "IDENTIFIER"

causes the MFL compiler to believe, for purposes of error diagnostics,
that the line number of the next source line is given by NUMBER and the
current input file is named by IDENTIFIER. If the identifier is absent,
the remembered file name does not change.
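
   As a minimal sketch of the inclusion directives described above (the
file names 'site_acl.mf' and 'local_rules.mf' are hypothetical and do
not ship with 'mailfromd'), a script could begin as follows:

     /* Searched for in the include search path only. */
     #include_once <site_acl.mf>
     /* Searched for in the current working directory first,
        then in the include search path. */
     #include_once "local_rules.mf"
     /* Has no effect: this file has already been included. */
     #include_once "local_rules.mf"

   Using '#include_once' rather than '#include' here means that a
repeated directive, whether written directly or reached through another
included file, should not redefine the functions the file declares.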


File: mailfromd.info,  Node: Pragmas,  Next: Data Types,  Prev: Comments,  Up: MFL

4.2 Pragmatic comments
======================

If '#' is immediately followed by the word 'pragma' (with optional
whitespace between them), such a construct introduces a "pragmatic
comment", i.e. an instruction that controls some configuration setting.

   The available pragma types are described in the following
subsections.

* Menu:

* prereq::                      Pragma prereq.
* stacksize::                   Pragma stacksize.
* regex::                       Pragma regex.
* dbprop::                      Pragma dbprop.
* greylist::                    Pragma greylist.
* miltermacros::                Pragma miltermacros.
* provide-callout::             Pragma provide-callout.


File: mailfromd.info,  Node: prereq,  Next: stacksize,  Up: Pragmas

4.2.1 Pragma prereq
-------------------

The '#pragma prereq' statement ensures that the correct 'mailfromd'
version is used to compile the source file it appears in. It takes a
version number as its argument and produces a compilation error if the
actual 'mailfromd' version number is earlier than that. For example,
the following statement:

     #pragma prereq 7.0.94

results in an error if compiled with 'mailfromd' version 7.0.93 or
prior.


File: mailfromd.info,  Node: stacksize,  Next: regex,  Prev: prereq,  Up: Pragmas

4.2.2 Pragma stacksize
----------------------

The 'stacksize' pragma sets the initial size of the run-time stack and
may also define its growth policy, in case it becomes full. The
default stack size is 4096 words. You may need to increase this number
if your configuration program uses recursive functions or does an
excessive amount of string manipulation.

 -- pragma: stacksize size [incr [max]]
     Sets stack size to SIZE units. Optional INCR and MAX define stack
     growth policy (see below). The default "units" are words.
     The following example sets the stack size to 7168 words:

          #pragma stacksize 7168

     The SIZE may end with a "unit size" suffix:

     Suffix    Meaning
     -------------------------------------------------------------------
     k         Kilowords, i.e. 1024 words
     m         Megawords, i.e. 1048576 words
     g         Gigawords
     t         Terawords (ouch!)

     Table 4.1: Unit Size Suffix

     Size suffixes are case-insensitive, so the following two pragmas
     are equivalent and set the stack size to '7*1048576 = 7340032'
     words:

          #pragma stacksize 7m
          #pragma stacksize 7M

     When the MFL engine notices that there is no more stack space
     available, it attempts to expand the stack. If this attempt
     succeeds, the operation continues. Otherwise, a runtime error is
     reported and the execution of the filter stops.

     The optional INCR argument to '#pragma stacksize' defines the
     growth policy for the stack. Two growth policies are implemented:
     the "fixed increment policy", which expands the stack in fixed-size
     "expansion chunks", and the "exponential growth policy", which
     doubles the stack size until it is able to accommodate the needed
     number of words. The fixed increment policy is the default. The
     default chunk size is 4096 words.

     If INCR is the word 'twice', the exponential growth policy is
     selected. Otherwise INCR must be a positive number optionally
     suffixed with a size suffix (see above). This indicates the
     expansion chunk size for the fixed increment policy.

     The following example sets the initial stack size to 10240 words,
     and the expansion chunk size to 2048 words:

          #pragma stacksize 10K 2K

     The pragma below enables the exponential stack growth policy:

          #pragma stacksize 10240 twice

     In this case, when the run-time evaluator hits the stack size
     limit, it expands the stack to twice the size it had before.
     So, in the example above, the stack will be sequentially expanded
     to the following sizes: 20480, 40960, 81920, 163840, etc.

     The optional MAX argument defines the maximum size of the stack.
     If the stack grows beyond this limit, the execution of the script
     will be aborted.

   If you are concerned about the execution time of your script, you may
wish to avoid stack reallocations. To help you find out the optimal
stack size, each time the stack is expanded, 'mailfromd' issues a
warning in its log file, which looks like this:

     warning: stack segment expanded, new size=8192

   You can use these messages to adjust your stack size configuration
settings.


File: mailfromd.info,  Node: regex,  Next: dbprop,  Prev: stacksize,  Up: Pragmas

4.2.3 Pragma regex
------------------

The '#pragma regex' statement controls compilation of regular
expressions. You can use any number of such pragma directives in your
'mailfromd.mf'. The scope of '#pragma regex' extends to the next
occurrence of this directive or to the end of the script file, whichever
occurs first.

 -- pragma: regex [push|pop] flags
     The optional PUSH|POP parameter is one of the words 'push' or 'pop'
     and is discussed in detail below. The FLAGS parameter is a
     whitespace-separated list of "regex flags". Each regex-flag is a
     word specifying some regex feature. It can be preceded by '+' to
     enable this feature (this is the default), by '-' to disable it or
     by '=' to reset regex flags to its value. Valid regex-flags are:

     'extended'
          Use POSIX Extended Regular Expression syntax when interpreting
          regex. If not set, POSIX Basic Regular Expression syntax is
          used.

     'icase'
          Do not differentiate case. Subsequent regex searches will be
          case insensitive.

     'newline'
          "Match-any-character" operators don't match a newline.

          A non-matching list ('[^...]') not containing a newline does
          not match a newline.

          "Match-beginning-of-line" operator ('^') matches the empty
          string immediately after a newline.

          "Match-end-of-line" operator ('$') matches the empty string
          immediately before a newline.

     For example, the following pragma enables POSIX extended, case
     insensitive matching (a good thing to start your 'mailfromd.mf'
     with):

          #pragma regex +extended +icase

   Optional modifiers 'push' and 'pop' can be used to maintain a stack
of regex flags. The statement

     #pragma regex push [FLAGS]

saves the current regex flags on the stack and then optionally modifies
them as requested by FLAGS.

   The statement

     #pragma regex pop [FLAGS]

does the opposite: it restores the current regex flags from the top of
the stack and applies FLAGS to them.

   This statement is useful in module and include files to avoid
disturbing user regex settings. E.g.:

     #pragma regex push +extended +icase
     .
     .
     .
     #pragma regex pop


File: mailfromd.info,  Node: dbprop,  Next: greylist,  Prev: regex,  Up: Pragmas

4.2.4 Pragma dbprop
-------------------

 -- pragma: dbprop pattern prop ...
     This pragma configures properties for a DBM database. *Note
     Database functions::, for its detailed description.


File: mailfromd.info,  Node: greylist,  Next: miltermacros,  Prev: dbprop,  Up: Pragmas

4.2.5 Pragma greylist
---------------------

 -- pragma: greylist type
     Selects the greylisting implementation to use. Allowed values for
     TYPE are:

     traditional
     gray
          Use the traditional greylisting implementation. This is the
          default.

     con-tassios
     ct
          Use Con Tassios greylisting implementation.

     *Note greylisting types::, for a detailed description of these
     greylisting implementations.

   Notice that this pragma can be used only once. A second use of this
pragma would constitute an error, because you cannot use both
greylisting implementations in the same program.


File: mailfromd.info,  Node: miltermacros,  Next: provide-callout,  Prev: greylist,  Up: Pragmas

4.2.6 Pragma miltermacros
-------------------------

 -- pragma: miltermacros handler macro ...
     Declare that the Milter stage HANDLER uses the MTA macros listed as
     the remaining arguments. The HANDLER must be a valid handler name
     (*note Handlers::).

   The 'mailfromd' parser collects the names of the macros referred to
by a '$NAME' construct within a handler (*note Sendmail Macros::) and
declares them automatically for the corresponding handlers. It is,
however, unable to track macros used in functions called from a handler,
or those referred to via the 'getmacro' and 'macro_defined' functions.
Such macros should be declared using '#pragma miltermacros'.

   During initial negotiation with the MTA, 'mailfromd' will ask it to
export the macro names declared automatically or by using '#pragma
miltermacros'. The MTA is free to honor or to ignore this request. In
particular, Sendmail versions prior to 8.14.0 and Postfix versions prior
to 2.5 do not support this feature. If you use one of these, you will
need to export the needed macros explicitly in the MTA configuration.
For more details, refer to the section in *note MTA Configuration::
corresponding to your MTA type.


File: mailfromd.info,  Node: provide-callout,  Prev: miltermacros,  Up: Pragmas

4.2.7 Pragma provide-callout
----------------------------

The '#pragma provide-callout' statement is used in the 'callout' module
to inform 'mailfromd' that the module has been loaded.

   Do not use this pragma.


File: mailfromd.info,  Node: Data Types,  Next: Numbers,  Prev: Pragmas,  Up: MFL

4.3 Data Types
==============

The 'mailfromd' filter script language operates on entities of two
types: numeric and string.

   The "numeric" type is represented internally as a signed long
integer. Depending on the machine architecture, its size can vary. For
example, on machines with Intel-based CPUs it is 32 bits long.

   A "string" is a string of characters of arbitrary length. Strings
can contain any characters except ASCII NUL.

   There is also a "generic pointer", which is designed to facilitate
certain operations. It appears only in the 'body' handler. *Note body
handler::, for more information about it.


File: mailfromd.info,  Node: Numbers,  Next: Literals,  Prev: Data Types,  Up: MFL

4.4 Numbers
===========

A "decimal number" is any sequence of decimal digits, not beginning with
'0'.

   An "octal number" is '0' followed by any number of octal digits ('0'
through '7'), for example: '0340'.

   A "hex number" is '0x' or '0X' followed by any number of hex digits
('0' through '9' and 'a' through 'f' or 'A' through 'F'), for example:
'0x3ef1'.


File: mailfromd.info,  Node: Literals,  Next: Here Documents,  Prev: Numbers,  Up: MFL

4.5 Literals
============

A literal is any sequence of characters enclosed in single or double
quotes.

   After 'tempfail' and 'reject' actions two special kinds of literals
are recognized: three-digit numeric values represent RFC 2821 reply
codes, and literals consisting of three digit groups separated by dots
represent an extended reply code as per RFC 1893/2034.
For example:

     510   # A reply code
     5.7.1 # An extended reply code

Double-quoted strings
---------------------

String literals enclosed in double quotation marks ("double-quoted
strings") are subject to "backslash interpretation", "macro expansion",
"variable interpretation" and "back reference interpretation".

   "Backslash interpretation" is performed at compilation time. It
consists of replacing the following "escape sequences" with the
corresponding single characters:

Sequence       Replaced with
\a             Audible bell character (ASCII 7)
\b             Backspace character (ASCII 8)
\f             Form-feed character (ASCII 12)
\n             Newline character (ASCII 10)
\r             Carriage return character (ASCII 13)
\t             Horizontal tabulation character (ASCII 9)
\v             Vertical tabulation character (ASCII 11)

Table 4.2: Backslash escapes

   In addition, the sequence '\NEWLINE' has the same effect as '\n', for
example:

     "a string with\
      embedded newline"
     "a string with\n embedded newline"

   Any escape sequence of the form '\xHH', where H denotes any hex
digit, is replaced with the character whose ASCII value is HH. For
example:

     "\x61nother" => "another"

   Similarly, an escape sequence of the form '\0OOO', where O is an
octal digit, is replaced with the character whose ASCII value is OOO.

   Macro expansion and variable interpretation occur at run-time.
During these phases all Sendmail macros (*note Sendmail Macros::),
'mailfromd' variables (*note Variables::), and constants (*note
Constants::) referenced in the string are replaced by their actual
values.
For example, if the Sendmail macro 'f' has the value
'postmaster@gnu.org.ua' and the variable 'last_ip' has the value
'127.0.0.1', then the string(1)

     "$f last connected from %last_ip;"

will be expanded to

     "postmaster@gnu.org.ua last connected from 127.0.0.1;"

   A "back reference" is a sequence '\D', where D is a decimal number.
It refers to the Dth parenthesized subexpression in the last 'matches'
statement(2). Any back reference occurring within a double-quoted
string is replaced by the value of the corresponding subexpression.
*Note Special comparisons::, for a detailed description of this process.
Back reference interpretation is performed at run time.

Single-quoted strings
---------------------

Any characters enclosed in single quotation marks are read unmodified.

   The following examples contain pairs of equivalent strings:

     "a string"
     'a string'

     "\\(.*\\):"
     '\(.*\):'

   Notice the last example. Single quotes are particularly useful in
writing regular expressions (*note Special comparisons::).

   ---------- Footnotes ----------

   (1) Implementation note: actually, the references are not interpreted
within the string, instead, each such string is split at compilation
time into a series of concatenated atoms. Thus, our sample string will
actually be compiled as:

     $f . " last connected from " . last_ip . ";"

   *Note Concatenation::, for a description of this construct. You can
easily see how various strings are interpreted by using '--dump-tree'
option (*note --dump-tree::). In this case, it will produce:

       CONCAT:
         CONCAT:
           CONCAT:
             SYMBOL: f
             CONSTANT: " last connected from "
           VARIABLE last_ip (13)
         CONSTANT: ";"

   (2) The subexpressions are numbered by the positions of their opening
parentheses, left to right.


File: mailfromd.info,  Node: Here Documents,  Next: Sendmail Macros,  Prev: Literals,  Up: MFL

4.6 Here Documents
==================

A "here-document" is a special form of a string literal that allows you
to specify multiline strings without having to use backslash escapes.
The format of here-documents is:

     <<[FLAGS]WORD
     ...
     WORD

   The '<<WORD' construct instructs the parser to read all the following
lines up to the line containing only WORD, with possible trailing
blanks. The lines thus read are concatenated together into a single
string. For example:

     set str <<EOT
     A multiline
     string
     EOT

   The body of a here-document is interpreted the same way as
double-quoted strings (*note Double-quoted strings::). For example, if
the Sendmail macro 'f' has the value 'jsmith@some.com' and the variable
'count' is set to '10', then the following string:

     set s <<EOT
     <$f> has tried to send %count mails.
     Please see docs for more info.
     EOT

will be expanded to:

     <jsmith@some.com> has tried to send 10 mails.
     Please see docs for more info.

   If the WORD is quoted, either by enclosing it in single quote
characters or by prepending it with a backslash, all interpretations and
expansions within the document body are suppressed. For example:

     set s <<'EOT'
     The following line is read verbatim:
     <$f> has tried to send %count mails.
     Please see docs for more info.
     EOT

   Optional FLAGS in the here-document construct control the way leading
white space is handled. If FLAGS is '-' (a dash), then all leading tab
characters are stripped from input lines and the line containing WORD.
Furthermore, if '-' is followed by a single space, all leading
whitespace is stripped from them. This allows here-documents within
configuration scripts to be indented in a natural fashion.
Example:

     <<- TEXT
         <$f> has tried to send %count mails.
         Please see docs for more info.
         TEXT

   Here-documents are particularly useful with 'reject' actions (*note
reject::).


File: mailfromd.info,  Node: Sendmail Macros,  Next: Constants,  Prev: Here Documents,  Up: MFL

4.7 Sendmail Macros
===================

Sendmail macros are referenced exactly the same way they are in the
'sendmail.cf' configuration file, i.e. '$NAME', where NAME represents
the macro name. Notice that the notation is the same for both
single-character and multi-character macro names. For consistency with
the 'Sendmail' configuration the '${NAME}' notation is also accepted.

   Another way to reference Sendmail macros is by using the function
'getmacro' (*note Macro access::).

   Sendmail macros evaluate to string values.

   Notice that to reference a macro, you must properly export it in
your MTA configuration. An attempt to reference a macro that is not
exported raises an 'e_macroundef' exception at run time (*note uncaught
exceptions::).


File: mailfromd.info,  Node: Constants,  Next: Variables,  Prev: Sendmail Macros,  Up: MFL

4.8 Constants
=============

A "constant" is a symbolic name for an MFL value. Constants are defined
using the 'const' statement:

     [QUALIFIER] const NAME EXPR

where NAME is an identifier, and EXPR is any valid MFL expression
evaluating immediately to a constant literal or numeric value. Optional
QUALIFIER defines the scope of visibility for that constant (*note scope
of visibility::): either 'public' or 'static'.

   Once defined, any appearance of NAME in the program text is replaced
by its value. For example:

     const x 10/5
     const text "X is "

defines the numeric constant 'x' with the value '2', and the literal
constant 'text' with the value 'X is '.
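
   As a further sketch (the names 'max_hops' and 'too_many_hops' are
hypothetical, chosen only for illustration), a 'static' constant can
replace a magic number in an ordinary expression:

     # Assumed limit; pick a value that suits your site.
     static const max_hops 25

     func too_many_hops(number n) returns number
     do
       return n > max_hops
     done

   Keeping the limit in one named constant means that only the 'const'
statement has to change if the limit is later revised.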

   A special construct is provided to define a series of numeric
constants (an "enumeration"):

     [QUALIFIER] const
     do
       NAME0 [EXPR0]
       NAME1 [EXPR1]
       ...
       NAMEN [EXPRN]
     done

Each EXPRN, if present, must evaluate to a constant numeric expression.
The resulting value will be assigned to constant NAMEN.  If EXPRN is not
supplied, the constant will be defined to the value of the previous
constant plus one.  If EXPR0 is not supplied, 0 is assumed.

   For example, consider the following statement:

     const
     do
       A
       B
       C 10
       D
     done

This defines 'A' to 0, 'B' to 1, 'C' to 10 and 'D' to 11.

   As a matter of fact, EXPRN may also evaluate to a constant string
expression, provided that all expressions in the enumeration 'const'
statement are supplied.  That is, the following is correct:

     const
     do
       A "one"
       B "two"
       C "three"
       D "four"
     done

whereas the following is not:

     const
     do
       A "one"
       B
       C "three"
       D "four"
     done

   Trying to compile the latter example will produce:

     mailfromd: FILENAME:5.3: initializer element is not numeric

which means that 'mailfromd' was trying to create constant 'B' with the
value of 'A' incremented by one, but was unable to do so, because the
value in question was not numeric.

   Constants can be used in normal MFL expressions as well as in
literals.  To expand a constant within a literal string, prepend a
percent sign to its name, e.g.:

     echo "New %text %x" => "New X is 2"

   This way of expanding constants creates an ambiguity if there happens
to be a variable of the same name as the constant.  *Note
variable--constant clashes::, for more information on this case and ways
to handle it.
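
   The two ways of using a constant can be combined.  The following
sketch is an illustration only (the constant name 'max_rcpt' and the
reply text are assumptions).  It compares the predefined variable
'rcpt_count' with a constant in an expression and expands the same
constant within a string literal:

     const max_rcpt 10

     prog envrcpt
     do
       if rcpt_count > max_rcpt
         reject 550 5.7.1 "Too many recipients (limit is %max_rcpt)"
       fi
     done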

* Menu:

* Built-in constants::


File: mailfromd.info,  Node: Built-in constants,  Up: Constants

4.8.1 Built-in constants
------------------------

Several constants are built into the MFL compiler.  To discern them from
user-defined ones, their names start and end with two underscores
('__').

   The following constants are defined in 'mailfromd' version 8.10:

 -- Built-in constant: string __file__
     Expands to the name of the current source file.

 -- Built-in constant: string __function__
     Expands to the name of the current lexical context, i.e. the
     function or handler name.

 -- Built-in constant: string __git__
     This built-in constant is defined for alpha versions only.  Its
     value is the Git tag of the most recent commit corresponding to
     that version of the package.  If the release contains some
     uncommitted changes, the value of the '__git__' constant ends with
     the suffix '-dirty'.

 -- Built-in constant: number __line__
     Expands to the current line number in the input source file.

 -- Built-in constant: number __major__
     Expands to the major version number.

     The following example uses the '__major__' constant to determine
     if some version-dependent feature can be used:

          if __major__ > 2
            # Use some version-specific feature
          fi

 -- Built-in constant: number __minor__
     Expands to the minor version number.

 -- Built-in constant: string __module__
     Expands to the name of the current module (*note Modules::).

 -- Built-in constant: string __package__
     Expands to the package name ('mailfromd').

 -- Built-in constant: number __patch__
     For alpha versions and maintenance releases expands to the version
     patch level.  For stable versions, expands to '0'.

 -- Built-in constant: string __defpreproc__
     Expands to the default external preprocessor command line, if the
     preprocessor is used, or to an empty string if it is not, e.g.:

          __defpreproc__ => "/usr/bin/m4 -s"

     *Note Preprocessor::, for information on preprocessor and its
     features.

 -- Built-in constant: string __preproc__
     Expands to the current external preprocessor command line, if the
     preprocessor is used, or to an empty string if it is not.  Note
     that it equals '__defpreproc__', unless the preprocessor was
     redefined using the '--preprocessor' command line option (*note
     -preprocessor: Preprocessor.).

 -- Built-in constant: string __version__
     Expands to the textual representation of the program version (e.g.
     '3.0.90').

 -- Built-in constant: string __defstatedir__
     Expands to the default state directory (*note statedir::).

 -- Built-in constant: string __statedir__
     Expands to the current value of the program state directory (*note
     statedir::).  Note that it is the same as '__defstatedir__'
     unless the state directory was redefined at run time.

   Built-in constants can be used as variables, which allows them to be
expanded within strings or here-documents.  The following example
illustrates the common practice used for debugging configuration
scripts:

     func foo(number x)
     do
       echo "%__file__:%__line__: foo called with arg %x"
       ...
     done

   If the 'echo' statement above is located in line 28 of the script
file '/etc/mailfromd.mf', then calling 'foo(10)' will produce the
following string in your logs:

     /etc/mailfromd.mf:28: foo called with arg 10


File: mailfromd.info,  Node: Variables,  Next: Back references,  Prev: Constants,  Up: MFL

4.9 Variables
=============

Variables represent regions of memory used to hold variable data.
These memory regions are identified by "variable names".  A variable
name must begin with a letter or underscore and may contain only
letters, digits and underscores.

   Each variable is associated with its "scope of visibility", which
defines the part of source code where it can be used (*note scope of
visibility::).  Depending on the scope, we discern three main classes of
variables: public, static and automatic (or local).

   "Public variables" have indefinite lexical scope, so they may be
referred to anywhere in the program.  "Static variables" are visible
only within their module (*note Modules::).  "Automatic" or "local
variables" are visible only within the given function or handler.

   Public and static variables are sometimes collectively called
"global".

   These variable classes occupy separate "namespaces", so that an
automatic variable can have the same name as an existing public or
static one.  In this case this variable is said to "shadow" its global
counterpart.  All references to such a name will refer to the automatic
variable until the end of its scope is reached, where the global one
becomes visible again.

   Likewise, a static variable may have the same name as a static
variable defined in another module.  However, it may not have the same
name as a public variable.

   A variable is "declared" using the following syntax:

     [QUALIFIERS] TYPE NAME

where NAME is the variable name and TYPE is the type of the data it is
supposed to hold.  It is 'string' for string variables and 'number' for
numeric ones.

   For example, this is a declaration of a string variable 'var':

     string var

   If a variable declaration occurs within a function (*note
User-defined: Functions.) or handler (*note Handlers::), it declares an
automatic variable, local to this function or handler.
Otherwise, it declares a global variable.

   Optional QUALIFIERS are allowed only in global declarations, i.e. in
the variable declarations that appear outside of functions.  They
specify the scope of the variable.  The 'public' qualifier declares the
variable as public and the 'static' qualifier declares it as static.
The default scope is 'public', unless specified otherwise in the module
declaration (*note module structure::).

   Additionally, QUALIFIERS may contain the word 'precious', which
instructs the compiler to mark this variable as "precious" (*note
precious variables: rset.).  The value of a precious variable is not
affected by the SMTP 'RSET' command.  If both scope qualifier and
'precious' are used, they may appear in any order, e.g.:

     static precious string rcpt_list

or

     precious static string rcpt_list

   A declaration can be followed by any valid MFL expression, which
supplies the initial value or "initializer" for the variable, for
example:

     string var "test"

   A global variable declared without an initializer is implicitly
initialized to a null value: numeric variables assume the initial value
0, string variables are initialized to the empty string.

   The value of an automatic variable declared without an initializer is
unspecified.  It is an error to use such a variable prior to assigning
it a value.

   A variable is assigned a value using the 'set' statement:

     set NAME EXPR

where NAME is the variable name and EXPR is a 'mailfromd' expression
(*note Expressions::).  The effect of this statement is that the EXPR is
evaluated and the value it yields is assigned to the variable NAME.

   If the 'set' statement is located outside a function or handler
definition, the EXPR must be a constant expression, i.e. the compiler
should be able to evaluate it immediately.  See optimizer.
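
   The declaration and assignment rules above can be summarized by the
following sketch.  The variable names are hypothetical and the fragment
is an illustration, not a complete filter:

     # Global variable without initializer: implicitly set to 0
     number attempts
     # Static global variable with initializer
     static string tag "none"

     # Outside of functions and handlers EXPR must be constant
     set attempts 3

     prog envfrom
     do
       # Automatic variable: assign a value before using it
       string sender
       set sender $1
       set tag "sender is %sender"
     done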

   It is not an error to assign a value to a variable that is not
declared.  In this case the assignment first declares a global or
automatic variable having the type of EXPR and then assigns a value to
it.  An automatic variable is created if the assignment occurs within a
function or handler; a global variable is declared if it occurs at the
topmost lexical level.  This is called "implicit variable declaration".

   In the MFL program, variables are referenced by their name.  When
appearing inside a double-quoted string, variables are referenced using
the notation '%NAME'.  Any variable being referenced must have been
declared earlier (either explicitly or implicitly).

* Menu:

* Predefined variables::


File: mailfromd.info,  Node: Predefined variables,  Up: Variables

4.9.1 Predefined Variables
--------------------------

Several variables are predefined.  In 'mailfromd' version 8.10 these
are:

 -- Predefined Variable: number cache_used
     This variable is set by 'stdpoll' and 'strictpoll' built-ins (and,
     consequently, by the 'on poll' statement).  Its value is '1' if the
     function used the cached data instead of directly polling the host,
     and '0' if the polling took place.  *Note SMTP Callout functions::.

     You can use this variable to make your reject message more
     informative for the remote party.  The common paradigm is to define
     a function, returning empty string if the result was obtained from
     polling, or some notice if cached data were used, and to use the
     function in the 'reject' text, for example:

          func cachestr() returns string
          do
            if cache_used
              return "[CACHED] "
            else
              return ""
            fi
          done

     Then, in 'prog envfrom' one can use:

          on poll $f
          do
          when not_found or failure:
            reject 550 5.1.0 cachestr() . "Sender validity not confirmed"
          done

 -- Predefined Variable: string clamav_virus_name
     Name of virus identified by 'ClamAV'.  Set by 'clamav' function
     (*note ClamAV::).

 -- Predefined Variable: number greylist_seconds_left
     Number of seconds left to the end of greylisting period.  Set by
     'greylist' and 'is_greylisted' functions (*note Special test
     functions::).

 -- Predefined Variable: string ehlo_domain
     Name of the domain used by polling functions in SMTP 'EHLO' or
     'HELO' command.  Default value is the fully qualified domain name
     of the host where 'mailfromd' is run.  *Note Polling::.

 -- Predefined Variable: string last_poll_greeting
     Callout functions (*note SMTP Callout functions::) set this
     variable before returning.  It contains the initial SMTP reply from
     the last polled host.

 -- Predefined Variable: string last_poll_helo
     Callout functions (*note SMTP Callout functions::) set this
     variable before returning.  It contains the reply to the 'HELO'
     ('EHLO') command, received from the last polled host.

 -- Predefined Variable: string last_poll_host
     Callout functions (*note SMTP Callout functions::) set this
     variable before returning.  It contains the host name or IP address
     of the last polled host.

 -- Predefined Variable: string last_poll_recv
     Callout functions (*note SMTP Callout functions::) set this
     variable before returning.  It contains the last SMTP reply
     received from the remote host.  In case of multi-line replies, only
     the first line is stored.  If nothing was received the variable
     contains the string 'nothing'.

 -- Predefined Variable: string last_poll_sent
     Callout functions (*note SMTP Callout functions::) set this
     variable before returning.  It contains the last SMTP command sent
     to the polled host.
     If nothing was sent, 'last_poll_sent' contains the string
     'nothing'.

 -- Predefined Variable: string mailfrom_address
     Email address used by polling functions in SMTP 'MAIL FROM' command
     (*note Polling::).  Default is '<>'.  Here is an example of how to
     change it:

          set mailfrom_address "postmaster@my.domain.com"

     You can set this value to a comma-separated list of email
     addresses, in which case the probing will try each address until
     either the remote party accepts it or the list of addresses is
     exhausted, whichever happens first.

     It is not necessary to enclose emails in angle brackets, as they
     will be added automatically where appropriate.  The only exception
     is the null return address, when used in a list of addresses.  In
     this case, it should always be written as '<>'.  For example:

          set mailfrom_address "postmaster@my.domain.com, <>"

 -- Predefined Variable: number sa_code
     Spam score for the message, set by 'sa' function (*note sa::).

 -- Predefined Variable: number rcpt_count
     The variable 'rcpt_count' keeps the number of recipients given so
     far by 'RCPT TO' commands.  It is defined only in 'envrcpt'
     handlers.

 -- Predefined Variable: number sa_threshold
     Spam threshold, set by 'sa' function (*note sa::).

 -- Predefined Variable: string sa_keywords
     Spam keywords for the message, set by 'sa' function (*note sa::).

 -- Predefined Variable: number safedb_verbose
     This variable controls the verbosity of the exception-safe database
     functions.  *Note safedb_verbose::.


File: mailfromd.info,  Node: Back references,  Next: Handlers,  Prev: Variables,  Up: MFL

4.10 Back references
====================

A "back reference" is a sequence '\D', where D is a decimal number.  It
refers to the Dth parenthesized subexpression in the last 'matches'
statement(1).
Any back reference occurring within a double-quoted
string is replaced with the value of the corresponding subexpression.
For example:

     if $f matches '.*@\(.*\)\.gnu\.org\.ua'
       set host \1
     fi

   If the value of 'f' macro is 'smith@unza.gnu.org.ua', the above code
will assign the string 'unza' to the variable 'host'.

   Note that each occurrence of 'matches' will reset the table of
back references, so try to use them as early as possible.  The following
example illustrates a common error, when the back reference is used
after the reference table has been reused by another matching:

     # Wrong!
     if $f matches '.*@\(.*\)\.gnu\.org\.ua'
       if $f matches 'some.*'
         set host \1
       fi
     fi

   This will produce the following run time error:

     mailfromd: RUNTIME ERROR near file.mf:3: Invalid back-reference number

because the inner match ('some.*') does not have any parenthesized
subexpressions.

   *Note Special comparisons::, for more information about the 'matches'
operator.

   ---------- Footnotes ----------

   (1) The subexpressions are numbered by the positions of their opening
parentheses, left to right.


File: mailfromd.info,  Node: Handlers,  Next: begin/end,  Prev: Back references,  Up: MFL

4.11 Handlers
=============

A "milter stage handler" (or "handler", for short) is a subroutine
responsible for processing a particular milter state.  There are nine
handlers available.  Their order of invocation and arguments are
described in *note Figure 3.1: milter-control-flow.

   A handler is defined using the following construct:

     prog HANDLER-NAME
     do
       HANDLER-BODY
     done

where HANDLER-NAME is the name of the handler (*note handler names::),
HANDLER-BODY is the list of filter statements composing the handler
body.
Some handlers take arguments, which can be accessed within the
HANDLER-BODY using the notation $N, where N is the ordinal number of the
argument.  Here we describe the available handlers and their arguments:

 -- Handler: connect (string $1, number $2, number $3, string $4)

     Invocation:
          This handler is called once at the beginning of each SMTP
          connection.

     Arguments:
            1. 'string'; The host name of the message sender, as
               reported by MTA.  Usually it is determined by a reverse
               lookup on the host address.  If the reverse lookup fails,
               '$1' will contain the message sender's IP address
               enclosed in square brackets (e.g. '[127.0.0.1]').

            2. 'number'; Socket address family.  You need to require the
               'status' module to get symbolic definitions for the
               address families.  Supported families are:

               Constant       Value   Meaning
               ---------------------------------------------------------
               FAMILY_STDIO   0       Standard input/output (the MTA
                                      is run with '-bs' option)
               FAMILY_UNIX    1       UNIX socket
               FAMILY_INET    2       IPv4 protocol
               FAMILY_INET6   3       IPv6 protocol

               Table 4.3: Supported socket families

            3. 'number'; Port number if '$2' is 'FAMILY_INET'.

            4. 'string'; Remote IP address if '$2' is 'FAMILY_INET' or
               full file name of the socket if '$2' is 'FAMILY_UNIX'.
               If '$2' is 'FAMILY_STDIO', '$4' is an empty string.

     The actions (*note Actions::) appearing in this handler are handled
     by Sendmail in a special way.  First of all, any textual message is
     ignored.  Secondly, the only action that immediately closes the
     connection is 'tempfail 421'.  Any other reply codes result in
     Sendmail switching to "nullserver" mode, where it accepts any
     commands, but answers with a failure to any of them, except for the
     following: 'QUIT', 'HELO', 'NOOP', which are processed as usual.

     The following table summarizes the Sendmail behavior depending on
     the action used:

     'tempfail 421 EXCODE MESSAGE'
          The caller is returned the following error message:

               421 4.7.0 HOSTNAME closing connection

          Both EXCODE and MESSAGE are ignored.

     'tempfail 4XX EXCODE MESSAGE'
          (where XX represents any digits, except '21') Both EXCODE and
          MESSAGE are ignored.  Sendmail switches to nullserver mode.
          Any subsequent command, excepting the ones listed above, is
          answered with

               454 4.3.0 Please try again later

     'reject 5XX EXCODE MESSAGE'
          (where XX represents any digits).  All arguments are ignored.
          Sendmail switches to nullserver mode.  Any subsequent command,
          excepting ones listed above, is answered with

               550 5.0.0 Command rejected

     Regarding reply codes, this behavior complies with RFC 2821
     (section 3.9), which states:

          An SMTP server _must not_ intentionally close the connection
          except:
          [...]
          - After detecting the need to shut down the SMTP service and
          returning a 421 response code.  This response code can be
          issued after the server receives any command or, if necessary,
          asynchronously from command receipt (on the assumption that
          the client will receive it after the next command is issued).

     However, the RFC says nothing about textual messages and extended
     error codes, therefore Sendmail's ignoring of these is, in my
     opinion, absurd.  My practice shows that it is often reasonable,
     and even necessary, to return a meaningful textual message if the
     initial connection is declined.  The opinion of 'mailfromd' users
     seems to support this view.  Bearing this in mind, 'mailfromd' is
     shipped with a patch for Sendmail, which makes it honor both
     extended return code and textual message given with the action.
     Two versions are provided: 'etc/sendmail-8.13.7.connect.diff', for
     Sendmail versions 8.13.x, and 'etc/sendmail-8.14.3.connect.diff',
     for Sendmail version 8.14.3.

 -- Handler: helo (string $1)

     Invocation:
          This handler is called whenever the SMTP client sends 'HELO'
          or 'EHLO' command.  Depending on the actual MTA configuration,
          it can be called several times or even not at all.

     Arguments:
            1. 'string'; Argument to 'HELO' ('EHLO') commands.

     Notes:
          According to RFC 2821, '$1' must be the domain name of the
          sending host, or, in case this is not available, its IP
          address enclosed in square brackets.  Be careful when making
          decisions based on this value, because in practice many hosts
          send arbitrary strings.  We recommend using the 'heloarg_test'
          function (*note heloarg_test::) if you wish to analyze this
          value.

 -- Handler: envfrom (string $1, string $2)

     Invocation:
          Called when the SMTP client sends 'MAIL FROM' command, i.e.
          once at the beginning of each message.

     Arguments:
            1. 'string'; First argument to the 'MAIL FROM' command, i.e.
               the email address of the sender.
            2. 'string'; Rest of arguments to 'MAIL FROM' separated by
               space character.  This argument can be '""'.

     Notes:
            1. '$1' is not the same as '$f' Sendmail variable, because
               the latter contains the sender email after address
               rewriting and normalization, while '$1' contains exactly
               the value given by sending party.

            2. When the array type is implemented, '$2' will contain an
               array of arguments.

 -- Handler: envrcpt (string $1, string $2)

     Invocation:
          Called once for each 'RCPT TO' command, i.e. once for each
          recipient, immediately after 'envfrom'.
     Arguments:
            1. 'string'; First argument to the 'RCPT TO' command, i.e.
               the email address of the recipient.
            2. 'string'; Rest of arguments to 'RCPT TO' separated by
               space character.  This argument can be '""'.

     Notes:
          When the array type is implemented, '$2' will contain an array
          of arguments.

 -- Handler: data ()

     Invocation:
          Called after the MTA receives SMTP 'DATA' command.  Notice
          that this handler is not supported by Sendmail versions prior
          to 8.14.0 and Postfix versions prior to 2.5.
     Arguments:
          None

 -- Handler: header (string $1, string $2)

     Invocation:
          Called once for each header line received after SMTP 'DATA'
          command.
     Arguments:
            1. 'string'; Header field name.
            2. 'string'; Header field value.  The content of the header
               may include folded white space, i.e., multiple lines with
               following white space where lines are separated by LF
               (ASCII 10).  The trailing line terminator (CR/LF) is
               removed.

 -- Handler: eoh

     Invocation:
          This handler is called once per message, after all headers
          have been sent and processed.
     Arguments:
          None.

 -- Handler: body (pointer $1, number $2)

     Invocation:
          This handler is called zero or more times, for each piece of
          the message body obtained from the remote host.
     Arguments:
            1. 'pointer'; Piece of body text.  See 'Notes' below.
            2. 'number'; Length of data pointed to by '$1', in bytes.
     Notes:
          The first argument points to the body chunk.  Its size may be
          quite considerable and passing it as a string may be costly
          both in terms of memory and execution time.  For this reason
          it is not passed as a string, but rather as a "generic
          pointer", i.e. an object having the same size as 'number',
          which can be used to retrieve the actual contents of the body
          chunk if the need arises.

          A special function 'body_string' is provided to convert this
          object to a regular MFL string (*note Mail body functions::).
          Using it you can collect the entire body text into a single
          global variable, as illustrated by the following example:

               string text

               prog body
               do
                 set text text . body_string($1,$2)
               done

   The text collected this way can then be used in the 'eom' handler
(see below) to parse and analyze it.

   If you wish to analyze both the headers and mail body, the following
code fragment will do that for you:

     string text

     # Collect all headers.
     prog header
     do
       set text text . $1 . ": " . $2 . "\n"
     done

     # Append terminating newline to the headers.
     prog eoh
     do
       set text "%text\n"
     done

     # Collect message body.
     prog body
     do
       set text text . body_string($1, $2)
     done

 -- Handler: eom

     Invocation:
          This handler is called once per message, when the terminating
          dot after 'DATA' command has been received.
     Arguments:
          None
     Notes:
          This handler is useful for calling "message capturing"
          functions, such as 'sa' or 'clamav'.  For more information
          about these, refer to *note Interfaces to Third-Party
          Programs::.
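
   As a minimal sketch of using the collected text, the following 'eom'
handler logs the size of the message text gathered by the 'header',
'eoh' and 'body' handlers shown above.  It assumes the global variable
'text' from that example and the 'length' library function; the
variable 'n' is hypothetical:

     prog eom
     do
       number n
       # Size of the accumulated headers and body
       set n length(text)
       echo "collected %n bytes of message text"
     done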

   For your reference, the following table shows each handler with its
arguments:

Handler    $1             $2             $3             $4
---------------------------------------------------------------------------
connect    Hostname       Socket         Port           Remote
                          Family                        address
helo       'HELO'         N/A            N/A            N/A
           domain
envfrom    Sender email   Rest of        N/A            N/A
           address        arguments
envrcpt    Recipient      Rest of        N/A            N/A
           email          arguments
           address
header     Header name    Header value   N/A            N/A
eoh        N/A            N/A            N/A            N/A
body       Body segment   Length of      N/A            N/A
           (pointer)      the segment
                          (numeric)
eom        N/A            N/A            N/A            N/A

Table 4.4: State Handler Arguments


File: mailfromd.info,  Node: begin/end,  Next: Functions,  Prev: Handlers,  Up: MFL

4.12 The 'begin' and 'end' special handlers
===========================================

Apart from the milter handlers described in the previous section, MFL
defines two special handlers, called 'begin' and 'end', which supply
startup and cleanup instructions for the filter program.

   The 'begin' special handler is executed once for each SMTP session,
after the connection has been established but before the first milter
handler has been called.  Similarly, the 'end' handler is executed
exactly once, after the connection has been closed.  Neither of them
takes any arguments.

   The two handlers are defined using the following syntax:

     # Begin handler
     begin
     do
       ...
     done

     # End handler
     end
     do
       ...
     done

where '...' represent any MFL statements.

   An MFL program may have multiple 'begin' and 'end' definitions.  They
can be intermixed with other definitions.  The compiler combines all
'begin' statements into a single one, in the order they appear in the
sources.  Similarly, all 'end' blocks are concatenated together.
The resulting 'begin' is called once, at the beginning of each SMTP
session, and 'end' is called once at its termination.

   Multiple 'begin' and 'end' handlers are a useful feature for writing
modules (*note Modules::), because each module can thus have its own
initialization and cleanup blocks.  Notice, however, that in this case
the order in which subsequent 'begin' and 'end' blocks are executed is
not defined.  It is only guaranteed that all 'begin' blocks are executed
at startup and all 'end' blocks are executed at shutdown.  It is also
guaranteed that all 'begin' and 'end' blocks defined within a
compilation unit (i.e. a single abstract source file, with all
'#include' and '#include_once' statements expanded in place) are
executed in order of their appearance in the unit.

   Due to their special nature, the startup and cleanup blocks impose
certain restrictions on the statements that can be used within them:

  1. 'return' cannot be used in 'begin' and 'end' handlers.

  2. The following Sendmail actions cannot be used in them: 'accept',
     'continue', 'discard', 'reject', 'tempfail'.  They can, however, be
     used in 'catch' statements, declared in 'begin' blocks (see example
     below).

  3. Header manipulation actions (*note header manipulation::) cannot be
     used in the 'end' handler.

   The 'begin' handlers are the usual place for global initialization
code.  For example, if you do not want to use DNS caching, you can
disable it this way:

     begin
     do
       db_set_active("dns", 0)
     done

   Additionally, you can set up global exception handling routines
there.
For example, the following 'begin' statement installs a handler
for all exceptions not handled otherwise that logs the exception along
with the stack trace and continues processing the message:

     begin
     do
       catch *
       do
         echo "Caught exception $1: $2"
         stack_trace()
         continue
       done
     done


File: mailfromd.info,  Node: Functions,  Next: Expressions,  Prev: begin/end,  Up: MFL

4.13 Functions
==============

A "function" is a named 'mailfromd' subroutine, which takes zero or more
"parameters" and optionally returns a certain value.  Depending on the
return value, functions can be subdivided into "string functions" and
"number functions".  A function may have "mandatory" and "optional
parameters".  When invoked, the function must be supplied at least as
many "actual arguments" as the number of its mandatory parameters.

   Functions are invoked using the following syntax:

     NAME (ARGS)

where NAME is the function name and ARGS is a comma-separated list of
expressions.  For example, the following are valid function calls:

     foo(10)
     interval("1 hour")
     greylist("/var/my.db", 180)

   The number of parameters a function takes and their data types
compose the "function signature".  When actual arguments are passed to
the function, they are converted to types of the corresponding formal
parameters.

   There are two major groups of functions: "built-in" functions, that
are implemented in the 'mailfromd' binary, and "user-defined" functions,
that are written in MFL.  The invocation syntax is the same for both
groups.

   'Mailfromd' is shipped with a rich set of "library functions".  These
are described in *note Library::.  In addition to these you can define
your own functions.

   Function definitions can appear anywhere between the handler
declarations in a filter program, the only requirement being that the
function definition occur before the place where the function is
invoked.

   The syntax of a function definition is:

     [QUALIFIER] func NAME (PARAM-DECL) returns DATA-TYPE
     do
       FUNCTION-BODY
     done

where NAME is the name of the function to define and PARAM-DECL is a
comma-separated list of parameter declarations.  The syntax of the
latter is the same as that of variable declarations (*note Variable
declarations: Variables.), i.e.:

     TYPE NAME

declares the parameter NAME having the type TYPE.  The TYPE is 'string'
or 'number'.

   The optional QUALIFIER declares the scope of visibility for that
function (*note scope of visibility::).  It is similar to that of
variables, except that functions cannot be local (i.e. you cannot
declare a function within another function).

   The 'public' qualifier declares a function that may be referred to
from any module, whereas the 'static' qualifier declares a function that
may be called only from the current module (*note Modules::).  The
default scope is 'public', unless specified otherwise in the module
declaration (*note module structure::).

   For example, the following declares a function 'sum' that takes two
numeric arguments and returns a numeric value:

     func sum(number x, number y) returns number

   Similarly, the following is a declaration of a static function:

     static func sum(number x, number y) returns number

   Parameters are referenced in the FUNCTION-BODY by their name, the
same way as other variables.  Similarly, the value of a parameter can be
altered using the 'set' statement.

   A function can be declared to take a certain number of "optional
arguments".
In a function declaration, optional formal arguments must be placed
after the mandatory ones, and must be separated from them with a
semicolon.  The following example is a definition of function 'foo',
which takes two mandatory and two optional arguments:

     func foo(string msg, string email; number x, string pfx)

Mandatory parameters are: 'msg' and 'email'.  Optional parameters are:
'x' and 'pfx'.  The actual number of arguments supplied to the function
is returned by a special construct '$#'.  In addition, the special
construct '@ARG' evaluates to the ordinal number of variable ARG in the
list of formal parameters (the first argument has number '0').  These
two constructs can be used to verify whether an argument is supplied to
the function.

   When an actual argument for parameter N is supplied, the number of
actual arguments ('$#') is greater than the ordinal number of that
parameter in the declaration list ('@N').  Thus, the following construct
can be used to check if an optional argument ARG is actually supplied:

     func foo(string msg, string email; number x, string arg)
     do
       if $# > @arg
         ...
       fi
     done

   The default 'mailfromd' installation provides a special macro for
this purpose: *note defined::.  Using it, the example above could be
rewritten as:

     func foo(string msg, string email; number x, string arg)
     do
       if defined(arg)
         ...
       fi
     done

   Within a function body, optional arguments are referenced exactly the
same way as the mandatory ones.  An attempt to dereference an optional
argument for which no actual parameter was supplied results in an
undefined value, so be sure to check whether a parameter is passed
before dereferencing it.

   A function can also take a variable number of arguments (such
functions are called "variadic").  This is indicated by the use of an
ellipsis as the last formal parameter.
The statement below defines a function 'foo' taking one mandatory, one
optional and any number of additional arguments:

     func foo (string a ; string b, ...)

   All actual arguments passed in a list of variable arguments are
coerced to the string data type.  To refer to these arguments in the
function body, the following construct is used:

     $(EXPR)

where EXPR is any valid MFL expression, evaluating to a number N.  This
construct refers to the value of the Nth actual parameter from the
variable argument list.  Parameters are numbered from '1', so the first
variable parameter is '$(1)', and the last one is '$($# - NM - NO)',
where NM and NO are the numbers of mandatory and optional parameters to
the function.

   For example, the function below prints all its arguments:

     func pargs (string text, ...)
     do
       echo "text=%text"
       loop for number i 1,
            while i <= $# - 1,
            set i i + 1
       do
         echo "arg %i=" . $(i)
       done
     done

Note the loop limits.  The last variable argument has number '$# - 1',
because the function takes one mandatory argument.

   The FUNCTION-BODY is any list of valid 'mailfromd' statements.  In
addition to the statements discussed below (*note Statements::) it can
also contain the 'return' statement, which is used to return a value
from the function.  The syntax of the return statement is

     return VALUE

   As an example of this, consider the following code snippet that
defines the function 'sum' to return the sum of its two arguments:

     func sum(number x, number y) returns number
     do
       return x + y
     done

   The 'returns' part in the function declaration is optional.  A
declaration lacking it defines a "procedure", or "void function", i.e.
a function that is not supposed to return any value.
Such functions cannot be used in expressions; instead, they are used as
statements (*note Statements::).  The following example shows a function
that emits a customized temporary failure notice:

     func stdtf()
     do
       tempfail 451 4.3.5 "Try again later"
     done

   A function may have several names.  An alternative name (or "alias")
can be assigned to a function by using the 'alias' keyword, placed after
the PARAM-DECL part, for example:

     func foo()
     alias bar
     returns string
     do
       ...
     done

   After this declaration, both 'foo()' and 'bar()' will refer to the
same function.

   The number of function aliases is unlimited.  The following fragment
declares a function having three names:

     func foo()
     alias bar
     alias baz
     returns string
     do
       ...
     done

   This feature is rarely needed, but there are cases when it may be
necessary.

   A variable declared within a function becomes a local variable to
this function.  Its lexical scope ends with the terminating 'done'
statement.

   Parameters, local variables and global variables use separate
namespaces, so a parameter name can coincide with the name of a global,
in which case the parameter is said to "shadow" the global.  All
references to its name will refer to the parameter until the end of its
scope is reached, where the global one becomes visible again.
Consider the following example:

     string x

     func foo(string x)
     do
       echo "foo: %x"
     done

     prog envfrom
     do
       set x "Global"
       foo("Local")
       echo x
     done

Running 'mailfromd --test' with this configuration will display:

     foo: Local
     Global

* Menu:

* Some Useful Functions::


File: mailfromd.info, Node: Some Useful Functions, Up: Functions

4.13.1 Some Useful Functions
----------------------------

To illustrate the concept of user-defined functions, this subsection
shows the definitions of some of the library functions shipped with
'mailfromd'(1).  These functions are contained in modules installed
along with the 'mailfromd' binary.  To use any of them in your code,
require the appropriate module as described in *note import::, e.g. to
use the 'revip' function, do 'require 'revip''.

   Functions and their definitions:

  1. 'revip'

     The function 'revip' (*note revip::) is implemented as follows:

          func revip(string ip) returns string
          do
            return inet_ntoa(ntohl(inet_aton(ip)))
          done

     Previously it was implemented using regular expressions.  Below we
     include this variant as well, as an illustration of the use of
     regular expressions:

          #pragma regex push +extended
          func revip(string ip) returns string
          do
            if ip matches '([0-9]+)\.([0-9]+)\.([0-9]+)\.([0-9]+)'
              return "\4.\3.\2.\1"
            fi
            return ip
          done
          #pragma regex pop

  2. 'strip_domain_part'

     This function returns at most N last components of the domain name
     DOMAIN (*note strip_domain_part::).

          #pragma regex push +extended

          func strip_domain_part(string domain, number n) returns string
          do
            if n > 0 and
               domain matches '.*((\.[^.]+){' . $2 .
                              '})'
              return substring(\1, 1, -1)
            else
              return domain
            fi
          done
          #pragma regex pop

  3. 'valid_domain'

     *Note valid_domain::, for a description of this function.  Its
     definition follows:

          require dns

          func valid_domain(string domain) returns number
          do
            return not (resolve(domain) = "0" and not hasmx(domain))
          done

  4. 'match_dnsbl'

     The function 'match_dnsbl' (*note match_dnsbl::) is defined as
     follows:

          require dns
          require match_cidr
          #pragma regex push +extended

          func match_dnsbl(string address, string zone, string range)
              returns number
          do
            string rbl_ip
            if range = 'ANY'
              set rbl_ip '127.0.0.0/8'
            else
              set rbl_ip range
              if not range matches '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
                return 0
              fi
            fi

            if not (address matches '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
                    and address != range)
              return 0
            fi

            if address matches
                '^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})$'
              if match_cidr (resolve ("\4.\3.\2.\1", zone), rbl_ip)
                return 1
              else
                return 0
              fi
            fi
            # never reached
          done

   ---------- Footnotes ----------

   (1) Notice that these are intended for educational purposes and do
not necessarily coincide with the actual definitions of these functions
in Mailfromd version 8.10.


File: mailfromd.info, Node: Expressions, Next: Shadowing, Prev: Functions, Up: MFL

4.14 Expressions
================

Expressions are language constructs that evaluate to a value, which can
subsequently be echoed, tested in a conditional statement, assigned to a
variable or passed to a function.

* Menu:

* Constant expressions::        String and Numeric Constants.
* Function calls::              A Function Call is an Expression.
* Concatenation::               String Concatenation.
* Arithmetic operations::       '+', '-', etc.
* Bitwise shifts::              '<<' and '>>'.
* Relational expressions::      '=', '<', etc.
* Special comparisons::         'matches', 'mx matches', etc.
* Boolean expressions::         'and', 'or', 'not'.
* Precedence::                  How various operators nest.
* Type casting::


File: mailfromd.info, Node: Constant expressions, Next: Function calls, Up: Expressions

4.14.1 Constant Expressions
---------------------------

Literals and numbers are "constant expressions".  They evaluate to
string and numeric types, respectively.


File: mailfromd.info, Node: Function calls, Next: Concatenation, Prev: Constant expressions, Up: Expressions

4.14.2 Function Calls
---------------------

A function call is an expression.  Its type is the return type of the
function.


File: mailfromd.info, Node: Concatenation, Next: Arithmetic operations, Prev: Function calls, Up: Expressions

4.14.3 Concatenation
--------------------

The concatenation operator is '.' (a dot).  For example, if '$f' is
'smith', and '$client_addr' is '10.10.1.1', then:

     $f . "-" . $client_addr => "smith-10.10.1.1"

   Any two adjacent literal strings are concatenated, producing a new
string, e.g.

     "GNU's" " not " "UNIX" => "GNU's not UNIX"


File: mailfromd.info, Node: Arithmetic operations, Next: Bitwise shifts, Prev: Concatenation, Up: Expressions

4.14.4 Arithmetic Operations
----------------------------

The filter script language offers the common arithmetic operators: '+',
'-', '*' and '/'.  In addition, the '%' is a "modulo" operator, i.e. it
computes the remainder of division of its operands.

   All of them follow the usual precedence rules and work as you would
expect them to.
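
   As a small illustration (a sketch only; the variable name 'n' and
the choice of the 'envfrom' handler are arbitrary), the following
handler exercises each of these operators.  It assumes that MFL numbers
are integers, so '/' is expected to discard any fractional part:

     prog envfrom
     do
       number n
       set n 17
       # The parentheses are redundant, because '.' binds weaker
       # than the arithmetic operators; they are kept for clarity.
       echo "sum: " . (n + 5)
       echo "quotient: " . (n / 5)
       echo "remainder: " . (n % 5)
       echo "mixed: " . (2 + 4 * 8)
     done

In the last statement the multiplication is performed first, so the
expression is the same one that appears in the explicit type cast
example in *note Type casting::.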


File: mailfromd.info, Node: Bitwise shifts, Next: Relational expressions, Prev: Arithmetic operations, Up: Expressions

4.14.5 Bitwise shifts
---------------------

The '<<' operator represents a "bitwise shift left" operation, which
shifts the binary representation of the operand on its left by the
number of bits given by the operand on its right.

   Similarly, the '>>' operator represents a "bitwise shift right".


File: mailfromd.info, Node: Relational expressions, Next: Special comparisons, Prev: Bitwise shifts, Up: Expressions

4.14.6 Relational Expressions
-----------------------------

Relational expressions are:

Expression     Result
--------------------------------------------------------------------------
X '<' Y        True if X is less than Y.
X '<=' Y       True if X is less than or equal to Y.
X '>' Y        True if X is greater than Y.
X '>=' Y       True if X is greater than or equal to Y.
X '=' Y        True if X is equal to Y.
X '!=' Y       True if X is not equal to Y.

Table 4.5: Relational Expressions

   The relational expressions apply to strings as well as to numbers.
When a relational operation applies to strings, case-sensitive
comparison is used, e.g.:

     "String" = "string"  => False
     "String" < "string"  => True


File: mailfromd.info, Node: Special comparisons, Next: Boolean expressions, Prev: Relational expressions, Up: Expressions

4.14.7 Special Comparisons
--------------------------

In addition to the traditional relational operators, described above,
'mailfromd' provides two operators for regular expression matching:

Expression       Result
--------------------------------------------------------------------------
X 'matches' Y    True if the string X matches the regexp denoted by
                 Y.
X 'fnmatches' Y  True if the string X matches the globbing pattern
                 denoted by Y.

Table 4.6: Regular Expression Matching

   The type of the regular expression used by the 'matches' operator is
controlled by '#pragma regex' (*note pragma regex::).  For example:

     $f                              => "gray@gnu.org.ua"
     $f matches '.*@gnu\.org\.ua'    => true
     $f matches '.*@GNU\.ORG\.UA'    => false
     #pragma regex +icase
     $f matches '.*@GNU\.ORG\.UA'    => true

   The 'fnmatches' operator compares its left-hand operand with a
globbing pattern (see 'glob(7)') given as its right-hand side operand.
For example:

     $f                              => "gray@gnu.org.ua"
     $f fnmatches "*ua"              => true
     $f fnmatches "*org"             => false
     $f fnmatches "*org*"            => true

   Both operators have a special form, for "'MX' pattern matching".  The
expression:

     X mx matches Y

is evaluated as follows: first, the expression X is analyzed and, if it
is an email address, its domain part is selected.  If it is not, its
value is used verbatim.  Then the list of 'MX's for this domain is
looked up.  Each of the 'MX' names is then compared with the regular
expression Y.  If any of the names matches, the expression returns true.
Otherwise, its result is false.

   Similarly, the expression:

     X mx fnmatches Y

returns true only if any of the 'MX's for (domain or email) X match the
globbing pattern Y.

   Both 'mx matches' and 'mx fnmatches' can signal the following
exceptions: 'e_temp_failure', 'e_failure'.

   The value of any parenthesized subexpression occurring within the
right-hand side argument to 'matches' or 'mx matches' can be referenced
using the notation '\D', where D is the ordinal number of the
subexpression (subexpressions are numbered from left to right, starting
at 1).
This notation is allowed in the program text as well as within
double-quoted strings and here-documents, for example:

     if $f matches '.*@\(.*\)\.gnu\.org\.ua'
       set message "Your host name is \1;"
     fi

   Remember that the grouping symbols are '\(' and '\)' for basic
regular expressions, and '(' and ')' for extended regular expressions.
Also make sure you properly escape all special characters (backslashes
in particular) in double-quoted strings, or use single-quoted strings to
avoid having to do so (*note singe-vs-double::, for a comparison of the
two forms).


File: mailfromd.info, Node: Boolean expressions, Next: Precedence, Prev: Special comparisons, Up: Expressions

4.14.8 Boolean Expressions
--------------------------

A "boolean expression" is a combination of relational or matching
expressions using the boolean operators 'and', 'or' and 'not', and,
optionally, parentheses to control nesting:

Expression     Result
--------------------------------------------------------------------------
X 'and' Y      True only if both X and Y are true.
X 'or' Y       True if any of X or Y is true.
'not' X        True if X is false.

Table 4.1: Boolean Operators

   Binary boolean expressions are computed using "shortcut evaluation":

'X and Y'
     If 'X => false', the result is 'false' and Y is not evaluated.

'X or Y'
     If 'X => true', the result is 'true' and Y is not evaluated.


File: mailfromd.info, Node: Precedence, Next: Type casting, Prev: Boolean expressions, Up: Expressions

4.14.9 Operator Precedence
--------------------------

Operator "precedence" is an abstract value associated with each language
operator that determines the order in which operators are executed when
they appear together within a single expression.  Operators with higher
precedence are executed first.
For example, '*' has a higher precedence than '+', therefore the
expression 'a + b * c' is evaluated in the following order: first 'b' is
multiplied by 'c', then 'a' is added to the product.

   When operators of equal precedence are used together they are
evaluated from left to right (i.e., they are "left-associative"), except
for comparison operators, which are non-associative (these are
explicitly marked as such in the table below).  This means that you
cannot write:

     if 5 <= x <= 10

Instead, you should write:

     if 5 <= x and x <= 10

   The precedences of the 'mailfromd' operators were selected so as to
match those used in most programming languages.(1)

   The following table lists all operators in order of decreasing
precedence:

'(...)'
     Grouping

'$ %'
     'Sendmail' macros and 'mailfromd' variables

'* /'
     Multiplication, division

'+ -'
     Addition, subtraction

'<< >>'
     Bitwise shift left and right

'< <= >= >'
     Relational operators (non-associative)

'= != matches fnmatches'
     Equality and special comparison (non-associative)

'&'
     Logical (bitwise) AND

'^'
     Logical (bitwise) XOR

'|'
     Logical (bitwise) OR

'not'
     Boolean negation

'and'
     Logical 'and'

'or'
     Logical 'or'

'.'
     String concatenation

   ---------- Footnotes ----------

   (1) The only exception is 'not', whose precedence in MFL is much
lower than usual (in most programming languages it has the same
precedence as unary '-').  This makes it possible to write conditional
expressions in a more understandable manner.
Consider the following condition:

     if not x < 2 and y = 3

   It is understood as "if 'x' is not less than 2 and 'y' equals 3",
whereas with the usual precedence for 'not' it would have meant "if
negated 'x' is less than 2 and 'y' equals 3".


File: mailfromd.info, Node: Type casting, Prev: Precedence, Up: Expressions

4.14.10 Type Casting
--------------------

When the two operands of a binary expression have different types, the
'mailfromd' evaluator coerces them to a common type.  This is known as
"implicit type casting".  The rules for implicit type casting are:

  1. Both arguments to an arithmetical operation are cast to numeric
     type.

  2. Both arguments to the concatenation operation are cast to string.

  3. Both arguments to the 'matches' or 'fnmatches' operator are cast to
     string.

  4. The argument of the unary negation (arithmetical or boolean) is
     cast to numeric.

  5. Otherwise the right-hand side argument is cast to the type of the
     left-hand side argument.

   The construct for explicit type cast is:

     TYPE(EXPR)

where TYPE is the name of the type to coerce EXPR to.  For example:

     string(2 + 4*8) => "34"


File: mailfromd.info, Node: Shadowing, Next: Statements, Prev: Expressions, Up: MFL

4.15 Variable and Constant Shadowing
====================================

When any two named entities happen to have the same name we say that a
"name clash" occurs.  The handling of name clashes depends on the types
of the entities involved.

function - any
--------------

A name of a constant or variable can coincide with that of a function.
This does not produce any warnings or errors, because functions,
variables and constants use different namespaces.
For example, the following code is correct:

     const a 4

     func a()
     do
       echo a
     done

   When executed, it prints '4'.

function - function, handler - function, and function - handler
---------------------------------------------------------------

Redefinition of a function or using a predefined handler name (*note
Handlers::) as a function name results in a fatal error.  For example,
compiling this code:

     func a()
     do
       echo "1"
     done

     func a()
     do
       echo "2"
     done

causes the following error message:

     mailfromd: sample.mf:9: syntax error, unexpected
     FUNCTION_PROC, expecting IDENTIFIER

handler - variable
------------------

A variable name can coincide with a handler name.  For example, the
following code is perfectly OK:

     string envfrom "M"
     prog envfrom
     do
       echo envfrom
     done

handler - handler
-----------------

If two handlers with the same name are defined, the definition that
appears later in the source text replaces the previous one.  A warning
message is issued, indicating the locations of both definitions, e.g.:

     mailfromd: sample.mf:116: Warning: Redefinition of handler
     `envfrom'
     mailfromd: sample.mf:34: Warning: This is the location of the
     previous definition

variable - variable
-------------------

Defining a variable having the same name as an already defined one
results in a warning message being displayed.  The compilation succeeds.
The second variable "shadows" the first, that is, any subsequent
references to the variable name will refer to the second variable.
For example:

     string x "Text"
     number x 1

     prog envfrom
     do
       echo x
     done

   Compiling this code results in the following diagnostics:

     mailfromd: sample.mf:4: Redeclaring `x' as different data type
     mailfromd: sample.mf:2: This is the location of the previous
     definition

   Executing it prints '1', i.e. the value of the last definition of
'x'.

   The scope of the shadowing depends on storage classes of the two
variables.  If both of them have external storage class (i.e. are
global ones), the shadowing remains in effect until the end of input.
In other words, the previous definition of the variable is effectively
forgotten.

   If the previous definition is a global, and the shadowing definition
is an automatic variable or a function parameter, the scope of this
shadowing ends with the scope of the second variable, after which the
previous definition (global) becomes visible again.
Consider the following code:

     set x "initial"

     func foo(string x) returns string
     do
       return x
     done

     prog envfrom
     do
       echo foo("param")
       echo x
     done

   Its compilation produces the following warning:

     mailfromd: sample.mf:3: Warning: Parameter `x' is shadowing a global

   When executed, it produces the following output:

     param
     initial
     State envfrom: continue

variable - constant
-------------------

If a constant is defined which has the same name as a previously defined
variable (the constant "shadows" the variable), the compiler prints the
following diagnostic message:

     FILE:LINE: Warning: Constant name `NAME' clashes with a variable name
     FILE:LINE: Warning: This is the location of the previous definition

   A similar diagnostic is issued if a variable is defined whose name
coincides with a previously defined constant (the variable shadows the
constant).

   In any case, any subsequent notation %NAME refers to the last defined
symbol, be it variable or constant.

   Notice that shadowing occurs only when using the %NAME notation.
Referring to the constant using its name without '%' avoids shadowing
effects.

   If a variable shadows a constant, the scope of the shadowing depends
on the storage class of the variable.  For automatic variables and
function parameters, it ends with the final 'done' closing the function.
For global variables, it lasts up to the end of input.

   For example, consider the following code:

     const a 4

     func foo(string a)
     do
       echo a
     done

     prog envfrom
     do
       foo(10)
       echo a
     done

   When run, it produces the following output:

     $ mailfromd --test sample.mf
     mailfromd: sample.mf:3: Warning: Variable name `a' clashes with a
     constant name
     mailfromd: sample.mf:1: Warning: This is the location of the previous
     definition
     10
     4
     State envfrom: continue

constant - constant
-------------------

Redefining a constant produces a warning message.  The latter definition
shadows the former.  Shadowing remains in effect until the end of input.


File: mailfromd.info, Node: Statements, Next: Conditionals, Prev: Shadowing, Up: MFL

4.16 Statements
===============

Statements are language constructs that, unlike expressions, do not
return any value.  Statements execute some actions, such as assigning a
value to a variable, or serve to control the execution flow in the
program.

* Menu:

* Actions::                     Actions control the handling of the mail.
* Assignments::
* Pass::
* Echo::


File: mailfromd.info, Node: Actions, Next: Assignments, Up: Statements

4.16.1 Action Statements
------------------------

An "action" statement instructs 'mailfromd' to perform a certain action
over the message being processed.  There are two kinds of actions:
reply actions and header manipulation actions.

Reply Actions
.............

Reply actions tell 'Sendmail' to return a given response code to the
remote party.  There are five such actions:

'accept'
     Return an 'accept' reply.  The remote party will continue
     transmitting its message.

'reject CODE EXCODE MESSAGE-EXPR'
'reject (CODE-EXPR, EXCODE-EXPR, MESSAGE-EXPR)'
     Return a 'reject' reply.
     The remote party will have to cancel
     transmitting its message.  The three arguments are optional; their
     usage is described below.

'tempfail CODE EXCODE MESSAGE-EXPR'
'tempfail (CODE-EXPR, EXCODE-EXPR, MESSAGE-EXPR)'
     Return a 'temporary failure' reply.  The remote party can retry
     sending its message later.  The three arguments are optional; their
     usage is described below.

'discard'
     Instructs 'Sendmail' to accept the message and silently discard it
     without delivering it to any recipient.

'continue'
     Stops the current handler and instructs 'Sendmail' to continue
     processing of the message.

   Two actions, 'reject' and 'tempfail', can take up to three optional
parameters.  There are two forms of supplying these parameters.

   In the first form, called "literal" or "traditional" notation, the
arguments are supplied as additional words after the action name, and
are separated by whitespace.  The first argument is a three-digit RFC
2821 reply code.  It must begin with '5' for 'reject' and with '4' for
'tempfail'.  If two arguments are supplied, the second argument must be
either an "extended reply code" (RFC 1893/2034) or a textual string to
be returned along with the SMTP reply.  Finally, if all three arguments
are supplied, then the second one must be an extended reply code and the
third one must give the textual string.  The following examples
illustrate the possible ways of using the 'reject' statement:

     reject
     reject 503
     reject 503 5.0.0
     reject 503 "Need HELO command"
     reject 503 5.0.0 "Need HELO command"

   The term "textual string" used above means either a literal string
or an MFL expression that evaluates to a string.  However, both the code
and the extended code must always be literal.

   The second form of supplying arguments is called "functional"
notation, because it resembles the function syntax.
When used in this form, the action word is followed by a parenthesized
group of exactly three arguments, separated by commas.  Each argument is
an MFL expression.  The meaning and ordering of the arguments are the
same as in the literal form.  Any or all of these three arguments may be
absent, in which case each absent argument is replaced by its default
value.  To illustrate this, here are the statements from the previous
example, written in functional notation:

     reject(,,)
     reject(503,,)
     reject(503, 5.0.0)
     reject(503, , "Need HELO command")
     reject(503, 5.0.0, "Need HELO command")

   Notice that there is an important difference between the two
notations.  The functional notation allows both reply codes to be
computed at run time, e.g.:

     reject(500 + dig2*10 + dig3, "5.%edig2.%edig2")

Header Actions
..............

Header manipulation actions provide basic means to add, delete or modify
the message RFC 2822 headers.

'add NAME STRING'
     Add the header NAME with the value STRING.  E.g.:

          add "X-Seen-By" "Mailfromd 8.10"

     (notice argument quoting)

'replace NAME STRING'
     The same as 'add', but if the header NAME already exists, it will
     be removed first, for example:

          replace "X-Last-Processor" "Mailfromd 8.10"

'delete NAME'
     Delete the header named NAME:

          delete "X-Envelope-Date"

   These actions impose some restrictions.  First of all, their first
argument must be a literal string (not a variable or expression).
Secondly, there is no way to select a particular header instance to
delete or replace, which may be necessary to properly handle multiple
headers (e.g. 'Received').  For more elaborate ways of modifying
headers, see *note Header modification functions::.
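
   The two kinds of actions can be combined within one filter.  The
sketch below is only an illustration: the domain 'example.org', the
header name 'X-Processed-By' and the reply text are invented, and the
choice of handlers is an assumption rather than a recommendation:

     prog envfrom
     do
       if $f fnmatches "*@example.org"
         # Functional notation: the message text is computed at
         # run time from the sender address.
         reject(550, "5.7.1", "Mail from " . $f . " is not accepted")
       fi
     done

     prog eom
     do
       # Header actions take a literal header name.
       replace "X-Processed-By" "mailfromd"
     done

The literal notation would not work for the 'reject' statement above
only if the codes had to be computed as well; here it is used in
functional form merely to show the syntax.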
5981 5982 5983File: mailfromd.info, Node: Assignments, Next: Pass, Prev: Actions, Up: Statements 5984 59854.16.2 Variable Assignments 5986--------------------------- 5987 5988An "assignment" is a special statement that assigns a value to the 5989variable. It has the following syntax: 5990 5991 set NAME VALUE 5992 5993where NAME is the variable name and VALUE is the value to be assigned to 5994it. 5995 5996 Assignment statements can appear in any part of a filter program. If 5997an assignment occurs outside of function or handler definition, the 5998VALUE must be a literal value (*note Literals::). If it occurs within a 5999function or handler definition, VALUE can be any valid 'mailfromd' 6000expression (*note Expressions::). In this case, the expression will be 6001evaluated and its value will be assigned to the variable. For example: 6002 6003 set delay 150 6004 6005 prog envfrom 6006 do 6007 set delay delay * 2 6008 ... 6009 done 6010 6011 6012File: mailfromd.info, Node: Pass, Next: Echo, Prev: Assignments, Up: Statements 6013 60144.16.3 The 'pass' statement 6015--------------------------- 6016 6017The 'pass' statement has no effect. It is used in places where no 6018statement is needed, but the language syntax requires one: 6019 6020 on poll $f do 6021 when success: 6022 pass 6023 when not_found or failure: 6024 reject 550 6025 done 6026 6027 6028File: mailfromd.info, Node: Echo, Prev: Pass, Up: Statements 6029 60304.16.4 The 'echo' statement 6031--------------------------- 6032 6033The 'echo' statement concatenates all its arguments into a single string 6034and sends it to the 'syslog' using the priority 'info'. It is useful 6035for debugging your script, in conjunction with built-in constants (*note 6036Built-in constants::), for example: 6037 6038 func foo(number x) 6039 do 6040 echo "%__file__:%__line__: foo called with arg %x" 6041 ... 
6042 done 6043 6044 6045File: mailfromd.info, Node: Conditionals, Next: Loops, Prev: Statements, Up: MFL 6046 60474.17 Conditional Statements 6048=========================== 6049 6050"Conditional expressions", or conditionals for short, test some 6051conditions and alter the control flow depending on the result. There 6052are two kinds of conditional statements: "if-else" branches and "switch" 6053statements. 6054 6055 The syntax of an "if-else" branching construct is: 6056 6057 if CONDITION THEN-BODY [else ELSE-BODY] fi 6058 6059Here, CONDITION is an expression that governs control flow within the 6060statement. Both THEN-BODY and ELSE-BODY are lists of 'mailfromd' 6061statements. If CONDITION is true, THEN-BODY is executed, if it is 6062false, ELSE-BODY is executed. The 'else' part of the statement is 6063optional. The condition is considered false if it evaluates to zero, 6064otherwise it is considered true. For example: 6065 6066 if $f = "" 6067 accept 6068 else 6069 reject 6070 fi 6071 6072This will accept the message if the value of the 'Sendmail' macro '$f' 6073is an empty string, and reject it otherwise. Both THEN-BODY and 6074ELSE-BODY can be compound statements including other 'if' statements. 6075Nesting level of conditional statements is not limited. 6076 6077 To facilitate writing complex conditional statements, the 'elif' 6078keyword can be used to introduce alternative conditions, for example: 6079 6080 if $f = "" 6081 accept 6082 elif $f = "root" 6083 echo "Mail from root!" 6084 else 6085 reject 6086 fi 6087 6088 Another type of branching instruction is 'switch' statement: 6089 6090 switch CONDITION 6091 do 6092 case X1 [or X2 ...]: 6093 STMT1 6094 case Y1 [or Y2 ...]: 6095 STMT2 6096 . 6097 . 6098 . 6099 [default: 6100 STMT] 6101 done 6102 6103Here, X1, X2, Y1, Y2 are literal expressions; STMT1, STMT2 and STMT are 6104arbitrary 'mailfromd' statements (possibly compound); CONDITION is the 6105controlling expression. 
The vertical dotted row represents other 6106possible 'case' branches. 6107 6108 This statement is executed as follows: the CONDITION expression is 6109evaluated and if its value equals X1 or X2 (or any other X from the 6110first 'case'), then STMT1 is executed. Otherwise, if CONDITION 6111evaluates to Y1 or Y2 (or any other Y from the second 'case'), then 6112STMT2 is executed. Other 'case' branches are tried in turn. If none of 6113them matches, STMT (called the "default branch") is executed. 6114 6115 There can be as many 'case' branches as you wish. The 'default' 6116branch is optional. There can be at most one 'default' branch. 6117 6118 An example of a 'switch' statement follows: 6119 6120 switch x 6121 do 6122 case 1 or 3: 6123 add "X-Branch" "1" 6124 accept 6125 case 2 or 4 or 6: 6126 add "X-Branch" "2" 6127 default: 6128 reject 6129 done 6130 6131 If the value of 'mailfromd' variable 'x' is 1 or 3, it will accept 6132the message immediately, and add an 'X-Branch: 1' header to it. If 'x' 6133equals 2 or 4 or 6, this code will add an 'X-Branch: 2' header to the 6134message and will continue processing it. Otherwise, it will reject the 6135message. 6136 6137 The controlling condition of a 'switch' statement may evaluate to 6138numeric or string type. The type of the condition governs the type of 6139comparisons used in 'case' branches: for numeric types, numeric equality 6140is used, whereas for string types, string equality is used. 6141 6142 6143File: mailfromd.info, Node: Loops, Next: Exceptions, Prev: Conditionals, Up: MFL 6144 61454.18 Loop Statements 6146==================== 6147 6148The loop statement allows for repeated execution of a block of code, 6149controlled by some conditional expression. It has the following form: 6150 6151 loop [LABEL] 6152 [for STMT1] [,while EXPR1] [,STMT2] 6153 do 6154 STMT3 6155 done [while EXPR2] 6156 6157where STMT1, STMT2, and STMT3 are statement lists, EXPR1 and EXPR2 are 6158expressions.
6159 6160 The control flow is as follows: 6161 6162 1. If STMT1 is specified, execute it. 6163 6164 2. Evaluate EXPR1. If it is zero, go to 6. Otherwise, continue. 6165 6166 3. Execute STMT3. 6167 6168 4. If STMT2 is supplied, execute it. 6169 6170 5. If EXPR2 is given, evaluate it. If it is zero, go to 6. 6171 Otherwise, go to 2. 6172 6173 6. End. 6174 6175 Thus, STMT3 is executed until either EXPR1 or EXPR2 yield a zero 6176value. 6177 6178 The "loop body" - STMT3 - can contain special statements: 6179 6180'break [LABEL]' 6181 Terminates the loop immediately. Control passes to '6' (End) in 6182 the formal definition above. If LABEL is supplied, the statement 6183 terminates the loop statement marked with that label. This allows 6184 to break from nested loops. 6185 6186 It is similar to 'break' statement in C or shell. 6187 6188'next [LABEL]' 6189 Initiates next iteration of the loop. Control passes to '4' in the 6190 formal definition above. If LABEL is supplied, the statement 6191 starts next iteration of the loop statement marked with that label. 6192 This allows to request next iteration of an upper-level loop from a 6193 nested loop statement. 6194 6195 The 'loop' statement can be used to create iterative statements of 6196arbitrary complexity. Let's illustrate it in comparison with C. 6197 6198 The statement: 6199 6200 loop 6201 do 6202 STMT-LIST 6203 done 6204 6205creates an infinite loop. The only way to exit from such a loop is to 6206call 'break' (or 'return', if used within a function), somewhere in 6207STMT-LIST. 
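
   A minimal sketch of such a loop, left by means of 'break', is shown
below. The function name 'count_to_ten' and the limit are hypothetical
and serve only as an illustration:

     func count_to_ten() returns number
     do
       number i 0
       loop
       do
         /* Without this test the loop would never terminate. */
         if i = 10
           break
         fi
         set i i + 1
       done
       return i
     done

   Here 'break' passes control to the 'return' statement that follows
the loop.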
6208 6209 The following statement is equivalent to 'while (EXPR1) STMT-LIST' in 6210C: 6211 6212 loop while EXPR 6213 do 6214 STMT-LIST 6215 done 6216 6217 The C construct 'for (EXPR1; EXPR2; EXPR3)' is written in MFL as 6218follows: 6219 6220 loop for STMT1, while EXPR2, STMT2 6221 do 6222 STMT3 6223 done 6224 6225 For example, to repeat STMT3 10 times: 6226 6227 loop for set i 0, while i < 10, set i i + 1 6228 do 6229 STMT3 6230 done 6231 6232 Finally, the C 'do' loop is implemented as follows: 6233 6234 loop 6235 do 6236 STMT-LIST 6237 done while EXPR 6238 6239 As a real-life example of a loop statement, let's consider the 6240implementation of function 'ptr_validate', which takes a single argument 6241IPSTR, and checks its validity using the following algorithm: 6242 6243 Perform a DNS reverse-mapping for IPSTR, looking up the corresponding 6244'PTR' record in 'in-addr.arpa'. For each record returned, look up its 6245IP addresses (A records). If IPSTR is among the returned IP addresses, 6246return 1 ('true'), otherwise return 0 ('false'). 6247 6248 The implementation of this function in MFL is: 6249 6250 #pragma regex push +extended 6251 6252 func ptr_validate(string ipstr) returns number 6253 do 6254 loop for string names dns_getname(ipstr) . " " 6255 number i index(names, " "), 6256 while i != -1, 6257 set names substr(names, i + 1) 6258 set i index(names, " ") 6259 do 6260 loop for string addrs dns_getaddr(substr(names, 0, i)) . " " 6261 number j index(addrs, " "), 6262 while j != -1, 6263 set addrs substr(addrs, j + 1) 6264 set j index(addrs, " ") 6265 do 6266 if ipstr == substr(addrs, 0, j) 6267 return 1 6268 fi 6269 done 6270 done 6271 return 0 6272 done 6273 6274 6275File: mailfromd.info, Node: Exceptions, Next: Polling, Prev: Loops, Up: MFL 6276 62774.19 Exceptional Conditions 6278=========================== 6279 6280When the running program encounters a condition it is not able to 6281handle, it signals an "exception". 
To illustrate the concept, let's 6282consider the execution of the following code fragment: 6283 6284 if primitive_hasmx(domainpart($f)) 6285 accept 6286 fi 6287 6288The function 'primitive_hasmx' (*note primitive_hasmx::) tests whether 6289the domain name given as its argument has any 'MX' records. It should 6290return a boolean value. However, when querying the Domain Name System, 6291it may fail to get a definite result. For example, the DNS server can 6292be down or temporarily unavailable. In other words, 'primitive_hasmx' can 6293be in a situation when, instead of returning 'yes' or 'no', it has to 6294return 'don't know'. It has no way of doing so, therefore it signals an 6295"exception". 6296 6297 Each exception is identified by its "exception type", an integer number 6298associated with it. 6299 6300* Menu: 6301 6302* Built-in Exceptions:: 6303* User-defined Exceptions:: 6304* Catch and Throw:: 6305 6306 6307File: mailfromd.info, Node: Built-in Exceptions, Next: User-defined Exceptions, Up: Exceptions 6308 63094.19.1 Built-in Exceptions 6310-------------------------- 6311 6312The first 20 exception numbers are reserved for "built-in exceptions". 6313These are declared in module 'status.mf'. The following table 6314summarizes all built-in exception types implemented by 'mailfromd' 6315version 8.10. Exceptions are listed in lexicographic order. 6316 6317'e_badmmq' 6318 The called function cannot finish its task because an incompatible 6319 message modification function was called at some point before it. 6320 For details, *note MMQ and dkim_sign::. 6321 6322'e_dbfailure' 6323 General database failure. For example, the database cannot be 6324 opened. This exception can be signaled by any function that 6325 queries any DBM database. 6326 6327'e_divzero' 6328 Division by zero. 6329 6330'e_exists' 6331 This exception is emitted by the 'dbinsert' built-in if the requested 6332 key is already present in the database (*note dbinsert: Database 6333 functions.).
6334 6335'e_eof' 6336 Function reached end of file while reading. *Note I/O functions::, 6337 for a description of functions that can signal this exception. 6338 6339'e_failure' 6340'failure' 6341'e_failure' 6342 A general failure has occurred. In particular, this exception is 6343 signaled by DNS lookup functions when any permanent failure occurs. 6344 This exception can be signaled by any DNS-related function 6345 ('hasmx', 'poll', etc.) or operation ('mx matches'). 6346 6347'e_format' 6348 Invalid input format. This exception is signaled if input data to 6349 a function are improperly formatted. In version 8.10 it is 6350 signaled by 'message_burst' function if its input message is not 6351 formatted according to RFC 934. *Note Message digest functions::. 6352 6353'e_invcidr' 6354 Invalid CIDR notation. This is signaled by 'match_cidr' function 6355 when its second argument is not a valid CIDR. 6356 6357'e_invip' 6358 Invalid IP address. This is signaled by 'match_cidr' function when 6359 its first argument is not a valid IP address. 6360 6361'e_invtime' 6362 Invalid time interval specification. It is signaled by 'interval' 6363 function if its argument is not a valid time interval (*note time 6364 interval specification::). 6365 6366'e_io' 6367 An error occurred during the input-output operation. *Note I/O 6368 functions::, for a description of functions that can signal this 6369 exception. 6370 6371'e_macroundef' 6372 A Sendmail macro is undefined. 6373 6374'e_noresolve' 6375 The argument of a DNS-related function cannot be resolved to host 6376 name or IP address. Currently only 'ismx' (*note ismx::) raises 6377 this exception. 6378 6379'e_range' 6380 The supplied argument is outside the allowed range. This is 6381 signalled, for example, by 'substring' function (*note 6382 substring::). 6383 6384'e_regcomp' 6385 Regular expression cannot be compiled. 
This can happen when a 6386 regular expression (a right-hand argument of a 'matches' operator) 6387 is built at run time and the produced string is an invalid 6388 regex. 6389 6390'e_ston_conv' 6391 String-to-number conversion failed. This can be signaled when a 6392 string that cannot be converted to the numeric data type is used in 6393 numeric context. For example: 6394 6395 set x "10a" 6396 if x / 2 6397 ... 6398 6399 The 'if' condition will signal 'e_ston_conv', since '10a' cannot be 6400 converted to a number. 6401 6402'e_temp_failure' 6403'temp_failure' 6404'e_temp_failure' 6405 A temporary failure has occurred. This can be signaled by 6406 DNS-related functions or operations. 6407 6408'e_url' 6409 The supplied URL is invalid. *Note Interfaces to Third-Party 6410 Programs::. 6411 6412 In addition to these, two symbols are defined that are not exception 6413types in the strict sense of the word, but are provided to make writing 6414filter scripts more convenient. These are 'success', meaning successful 6415return from a function, and 'not_found', meaning that the required 6416entity (e.g. domain name or email address) was not found. *Note Figure 64174.1: figure-poll-wrapper, for an illustration of how these can be used. 6418For consistency with other exception codes, these can also be spelled as 6419'e_success' and 'e_not_found'. 6420 6421 6422File: mailfromd.info, Node: User-defined Exceptions, Next: Catch and Throw, Prev: Built-in Exceptions, Up: Exceptions 6423 64244.19.2 User-defined Exceptions 6425------------------------------ 6426 6427You can define your own exception types using the 'dclex' statement: 6428 6429 dclex TYPE 6430 6431 In this statement, TYPE must be a valid MFL identifier, not used for 6432another constant (*note Constants::). The 'dclex' statement defines a 6433new exception identified by the constant TYPE and allocates a new 6434exception number for it.
6435 6436 The TYPE can subsequently be used in 'throw' and 'catch' statements, 6437for example: 6438 6439 dclex myrange 6440 6441 func fact(number val) 6442 returns number 6443 do 6444 if val < 0 6445 throw myrange "fact argument is out of range" 6446 fi 6447 ... 6448 done 6449 6450 6451File: mailfromd.info, Node: Catch and Throw, Prev: User-defined Exceptions, Up: Exceptions 6452 64534.19.3 Exception Handling 6454------------------------- 6455 6456Normally when an exception is signalled, the program execution is 6457terminated and a 'tempfail' status is returned to the MTA. Additional 6458information regarding the exception is then output to the logging 6459channel (*note Logging and Debugging::). However, the user can 6460intercept any exception by installing their own exception-handling 6461routines. 6462 6463 An exception-handling routine is introduced by a "try-catch" 6464statement, which has the following syntax: 6465 6466 try 6467 do 6468 STMTLIST 6469 done 6470 catch EXCEPTION-LIST 6471 do 6472 HANDLER-BODY 6473 done 6474 6475where STMTLIST and HANDLER-BODY are sequences of MFL statements and 6476EXCEPTION-LIST is the list of exception types, separated by the word 6477'or'. A special EXCEPTION-LIST '*' is allowed and means all exceptions. 6478 6479 This construct works as follows. First, the statements from STMTLIST 6480are executed. If the execution finishes successfully, control is passed 6481to the first statement after the 'catch' block. Otherwise, if an 6482exception is signalled and this exception is listed in EXCEPTION-LIST, 6483the execution is passed to the HANDLER-BODY. If the exception is not 6484listed in EXCEPTION-LIST, it is handled as usual. 6485 6486 The following example shows a 'try--catch' construct used for 6487handling possible exceptions signalled by 'primitive_hasmx'.
6488 6489 try 6490 do 6491 if primitive_hasmx(domainpart($f)) 6492 accept 6493 else 6494 reject 6495 fi 6496 done 6497 catch e_failure or e_temp_failure 6498 do 6499 echo "primitive_hasmx failed" 6500 continue 6501 done 6502 6503 The 'try--catch' statement can appear anywhere inside a function or a 6504handler, but it cannot appear outside of them. It can also be nested 6505within another 'try--catch', in either of its parts. Upon exit from a 6506function or milter handler, all exceptions are restored to the state 6507they had when it has been entered. 6508 6509 A 'catch' block can also be used alone, without preceding 'try' part. 6510Such a construct is called a "standalone catch". It is mostly useful 6511for setting global exception handlers in a 'begin' statement (*note 6512begin/end::). When used within a usual function or handler, the 6513exception handlers set by a standalone catch remain in force until 6514either another standalone catch appears further in the same function or 6515handler, or an end of the function is encountered, whichever occurs 6516first. 6517 6518 A standalone catch defined within a function must return from it by 6519executing 'return' statement. If it does not do that explicitly, the 6520default value of 1 is returned. A standalone catch defined within a 6521milter handler must end execution with any of the following actions: 6522'accept', 'continue', 'discard', 'reject', 'tempfail'. By default, 6523'continue' is used. 6524 6525 It is not recommended to mix 'try--catch' constructs and standalone 6526catches. If a standalone catch appears within a 'try--catch' statement, 6527its scope of visibility is undefined. 6528 6529 Upon entry to a HANDLER-BODY, two implicit positional arguments are 6530defined, which can be referenced in HANDLER-BODY as '$1' and '$2'. The 6531first argument gives the numeric code of the exception that has 6532occurred. 
The second argument is a textual string containing a 6533human-readable description of the exception. 6534 6535 The following is an improved version of the previous example, which 6536uses these parameters to supply more information about the failure: 6537 6538 try 6539 do 6540 if primitive_hasmx(domainpart($f)) 6541 accept 6542 else 6543 reject 6544 fi 6545 done 6546 catch e_failure or e_temp_failure 6547 do 6548 echo "Caught exception $1: $2" 6549 continue 6550 done 6551 6552 The following example defines the function 'hasmx' that returns true 6553if the domain part of its argument has any 'MX' records, and false if it 6554does not or if an exception occurs (1). 6555 6556 func hasmx (string s) 6557 returns number 6558 do 6559 try 6560 do 6561 return primitive_hasmx(domainpart(s)) 6562 done 6563 catch * 6564 do 6565 return 0 6566 done 6567 done 6568 6569 The same function can be written using a standalone 'catch': 6570 6571 func hasmx (string s) 6572 returns number 6573 do 6574 catch * 6575 do 6576 return 0 6577 done 6578 return primitive_hasmx(domainpart(s)) 6579 done 6580 6581 All variables remain visible within the 'catch' body, with the exception 6582of positional arguments of the enclosing handler. To access positional 6583arguments of a handler from the 'catch' body, assign them to local 6584variables prior to the 'try--catch' construct, e.g.: 6585 6586 prog header 6587 do 6588 string hname $1 6589 string hvalue $2 6590 try 6591 do 6592 ... 6593 done 6594 catch * 6595 do 6596 echo "Exception $1 while processing header %hname: %hvalue" 6597 echo $2 6598 tempfail 6599 done 6600 done 6601 You can also generate (or "raise") exceptions explicitly in the code, 6602using the 'throw' statement: 6603 6604 throw EXCODE DESCR 6605 6606 The arguments correspond exactly to the positional parameters of the 6607'catch' statement: EXCODE gives the numeric code of the exception, DESCR 6608gives its textual description.
This statement can be used in complex 6609scripts to create non-local exits from deeply nested statements. 6610 6611 Notice that the EXCODE argument must be an immediate value: an 6612exception identifier (either a built-in one or one declared previously 6613using a 'dclex' statement). 6614 6615 ---------- Footnotes ---------- 6616 6617 (1) This function is part of the 'mailfromd' library, *Note hasmx::. 6618 6619 6620File: mailfromd.info, Node: Polling, Next: Modules, Prev: Exceptions, Up: MFL 6621 66224.20 Sender Verification Tests 6623============================== 6624 6625The filter script language provides a wide variety of functions for 6626sender address verification, or "polling" for short. These functions, 6627which are described in *note SMTP Callout functions::, can be used to 6628implement any sender verification method. Any additional data they may 6629need is normally supplied by two global variables: 'ehlo_domain', 6630keeping the default domain for the 'EHLO' command, and 6631'mailfrom_address', which stores the sender address for probe messages 6632(*note Predefined variables::). 6633 6634 For example, the simplest way to implement standard polling would be: 6635 6636 prog envfrom 6637 do 6638 if stdpoll($1, ehlo_domain, mailfrom_address) == 0 6639 accept 6640 else 6641 reject 550 5.1.0 "Sender validity not confirmed" 6642 fi 6643 done 6644 6645 However, this does not take into account exceptions that 'stdpoll' 6646can signal.
To handle them, one will have to use 'catch', for 6647example: 6648 6649 require status 6650 6651 prog envfrom 6652 do 6653 try 6654 do 6655 if stdpoll($1, ehlo_domain, mailfrom_address) == 0 6656 accept 6657 else 6658 reject 550 5.1.0 "Sender validity not confirmed" 6659 fi 6660 done 6661 catch e_failure or e_temp_failure 6662 do 6663 switch $1 6664 do 6665 case failure: 6666 reject 550 5.1.0 "Sender validity not confirmed" 6667 case temp_failure: 6668 tempfail 450 4.1.0 "Try again later" 6669 done 6670 done 6671 done 6672 6673 If polls are used often, one can define a wrapper function, and use 6674it instead. The following example illustrates this approach: 6675 6676 func poll_wrapper(string email) returns number 6677 do 6678 catch e_failure or e_temp_failure 6679 do 6680 return $1 6681 done 6682 return stdpoll(email, ehlo_domain, mailfrom_address) 6683 done 6684 6685 prog envfrom 6686 do 6687 switch poll_wrapper($f) 6688 do 6689 case success: 6690 accept 6691 case not_found or failure: 6692 reject 550 5.1.0 "Sender validity not confirmed" 6693 case temp_failure: 6694 tempfail 450 4.1.0 "Try again later" 6695 done 6696 done 6697 6698Figure 4.1: Building Poll Wrappers 6699 6700 Notice the way 'envfrom' handles 'success' and 'not_found', which are 6701not exceptions in the strict sense of the word. 6702 6703 The above paradigm is so common that 'mailfromd' provides a special 6704language construct to simplify it: the 'on' statement. Instead of 6705manually writing the wrapper function and using it as a 'switch' 6706condition, you can rewrite the above example as: 6707 6708 prog envfrom 6709 do 6710 on stdpoll($1, ehlo_domain, mailfrom_address) 6711 do 6712 when success: 6713 accept 6714 when not_found or failure: 6715 reject 550 5.1.0 "Sender validity not confirmed" 6716 when temp_failure: 6717 tempfail 450 4.1.0 "Try again later" 6718 done 6719 done 6720 6721Figure 4.2: Standard poll example 6722 6723As you can see, the statement is pretty similar to 'switch'.
The major 6724syntactic difference is the use of the keyword 'when' to introduce 6725conditional branches. 6726 6727 General syntax of the 'on' statement is: 6728 6729 on CONDITION 6730 do 6731 when X1 [or X2 ...]: 6732 STMT1 6733 when Y1 [or Y2 ...]: 6734 STMT2 6735 . 6736 . 6737 . 6738 done 6739 6740The CONDITION is either a function call or a special 'poll' statement 6741(see below). The values used in 'when' branches are normally symbolic 6742exception names (*note exception names::). 6743 6744 When the compiler processes the 'on' statement it does the following: 6745 6746 1. Builds a unique wrapper function, similar to that described in 6747 *note Figure 4.1: figure-poll-wrapper.; The name of the function is 6748 constructed from the CONDITION function name and an unsigned 6749 number, called "exception mask", that is unique for each 6750 combination of exceptions used in 'when' branches; To avoid name 6751 clashes with the user-defined functions, the wrapper name begins 6752 and ends with '$' which normally is not allowed in the identifiers; 6753 6754 2. Translates the 'on' body to the corresponding 'switch' statement; 6755 6756 A special form of the CONDITION is 'poll' keyword, whose syntax is: 6757 6758 poll [for] EMAIL 6759 [host HOST] 6760 [from DOMAIN] 6761 [as EMAIL] 6762 6763 The order of particular keywords in the 'poll' statement is 6764arbitrary, for example 'as EMAIL' can appear before EMAIL as well as 6765after it. 6766 6767 The simplest form, 'poll EMAIL', performs the standard sender 6768verification of email address EMAIL. It is translated to the following 6769function call: 6770 6771 stdpoll(EMAIL, ehlo_domain, mailfrom_address) 6772 6773 The construct 'poll EMAIL host HOST', runs the strict sender 6774verification of address EMAIL on the given host. It is translated to 6775the following call: 6776 6777 strictpoll(HOST, EMAIL, ehlo_domain, mailfrom_address) 6778 6779 Other keywords of the 'poll' statement modify these two basic forms. 
6780The 'as' keyword introduces the email address to be used in the SMTP 6781'MAIL FROM' command, instead of 'mailfrom_address'. The 'from' keyword 6782sets the domain name to be used in 'EHLO' command. So, for example the 6783following construct: 6784 6785 poll EMAIL host HOST from DOMAIN as ADDR 6786 6787is translated to 6788 6789 strictpoll(HOST, EMAIL, DOMAIN, ADDR) 6790 6791 To summarize the above, the code described in *note Figure 4.2: 6792figure-stdpoll. can be written as: 6793 6794 prog envfrom 6795 do 6796 on poll $f do 6797 when success: 6798 accept 6799 when not_found or failure: 6800 reject 550 5.1.0 "Sender validity not confirmed" 6801 when temp_failure: 6802 tempfail 450 4.1.0 "Try again later" 6803 done 6804 done 6805 6806 6807File: mailfromd.info, Node: Modules, Next: Preprocessor, Prev: Polling, Up: MFL 6808 68094.21 Modules 6810============ 6811 6812A "module" is a logically isolated part of code that implements a 6813separate concern or feature and contains a collection of conceptually 6814united functions and/or data. Each module occupies a separate 6815compilation unit (i.e. file). The functionality provided by a module 6816is incorporated into another module or the main program by "requiring" 6817this module or by "importing" the desired components from it. 6818 6819* Menu: 6820 6821* module structure:: Declaring Modules 6822* scope of visibility:: 6823* import:: Require and Import 6824 6825 6826File: mailfromd.info, Node: module structure, Next: scope of visibility, Up: Modules 6827 68284.21.1 Declaring Modules 6829------------------------ 6830 6831A module file must begin with a "module declaration": 6832 6833 module MODNAME [INTERFACE-TYPE]. 6834 6835 Note the final dot. 6836 6837 The MODNAME parameter declares the name of the module. It is 6838recommended that it be the same as the file name without the '.mf' 6839extension. The module name must be a valid MFL literal. 
It also must 6840not coincide with any defined MFL symbol, therefore we recommend 6841always quoting it (see example below). 6842 6843 The optional parameter INTERFACE-TYPE defines the "default scope of 6844visibility" for the symbols declared in this module. If it is 'public', 6845then all symbols declared in this module are made public (importable) by 6846default, unless explicitly declared otherwise (*note scope of 6847visibility::). If it is 'static', then all symbols not explicitly 6848marked as public become static. If the INTERFACE-TYPE is not given, 6849'public' is assumed. 6850 6851 The actual MFL code follows the 'module' line. 6852 6853 The module definition is terminated by the "logical end" of its 6854compilation unit, i.e. either by the end of file, or by the keyword 6855'bye', whichever occurs first. 6856 6857 The special keyword 'bye' may be used to prematurely end the current 6858compilation unit before the physical end of the containing file. Any 6859material between 'bye' and the end of file is ignored by the compiler. 6860 6861 Let's illustrate these concepts by writing a module 'revip': 6862 6863 module 'revip' public. 6864 6865 func revip(string ip) 6866 returns string 6867 do 6868 return inet_ntoa(ntohl(inet_aton(ip))) 6869 done 6870 6871 bye 6872 6873 This text is ignored. You may put any additional 6874 documentation here. 6875 6876 6877File: mailfromd.info, Node: scope of visibility, Next: import, Prev: module structure, Up: Modules 6878 68794.21.2 Scope of Visibility 6880-------------------------- 6881 6882The "scope of visibility" of a symbol defines where this symbol may be 6883referred to from. Symbols in MFL may have either of the following two 6884scopes: 6885 6886"Public" 6887 Public symbols are visible from the current module, as well as from 6888 any external modules, including the main script file, provided that 6889 they are properly imported (*note import::).
6890 6891"Static" 6892 Static symbols are visible only from the current module. There is 6893 no way to refer to them from outside. 6894 6895 The default scope of visibility for all symbols declared within a 6896module is defined in the module declaration (*note module structure::). 6897It may be overridden for any individual symbol by prefixing its 6898declaration with an appropriate "qualifier": either 'public' or 6899'static'. 6900 6901 6902File: mailfromd.info, Node: import, Prev: scope of visibility, Up: Modules 6903 69044.21.3 Require and Import 6905------------------------- 6906 6907Functions or variables declared in another module must be "imported" 6908prior to their actual use. MFL provides two ways of doing so: by 6909"requiring" the entire module or by importing selected symbols from it. 6910 6911 -- Module Import: require modname 6912 The 'require' statement instructs the compiler to locate the module 6913 MODNAME and to load all public interfaces from it. 6914 6915 The compiler looks for the file 'MODNAME.mf' in the current search 6916path (*note include search path::). If no such file is found, a 6917compilation error is reported. 6918 6919 For example, the following statement: 6920 6921 require revip 6922 6923imports all interfaces from the module 'revip.mf'. 6924 6925 Another, more sophisticated way to import from a module is to use the 6926'from ... import' construct: 6927 6928 from MODULE import SYMBOLS. 6929 6930 Note the final dot. The 'from' and 'module' statements are the only 6931two constructs in MFL that require the delimiter. 6932 6933 The MODULE has the same semantics as in the 'require' construct. The 6934SYMBOLS is a comma-separated list of symbol names to import from MODULE. 6935A symbol name may be given in several forms: 6936 6937 1. Literal 6938 6939 Literals specify exact symbol names to import. For example, the 6940 following statement imports from module 'A.mf' symbols 'foo' and 6941 'bar': 6942 6943 from A import foo,bar. 
6944 6945 2. Regular expression 6946 6947 Regular expressions must be surrounded by slashes. A regular 6948 expression instructs the compiler to import all symbols whose names 6949 match that expression. For example, the following statement 6950 imports from 'A.mf' all symbols whose names begin with 'foo' and 6951 contain at least one digit after it: 6952 6953 from A import '/^foo.*[0-9]/'. 6954 6955 The type of regular expressions used in the 'from' statement is 6956 controlled by '#pragma regex' (*note regex::). 6957 6958 3. Regular expression with transformation 6959 6960 Regular expression may be followed by a "s-expression", i.e. a 6961 'sed'-like expression of the form: 6962 6963 s/REGEXP/REPLACE/[FLAGS] 6964 6965 where REGEXP is a "regular expression", REPLACE is a replacement 6966 for each part of the input that matches REGEXP. S-expressions and 6967 their parts are discussed in detail in *note s-expression::. 6968 6969 The effect of such construct is to import all symbols that match 6970 the regular expression and apply the s-expression to their names. 6971 6972 For example: 6973 6974 from A import '/^foo.*[0-9]/s/.*/my_&/'. 6975 6976 This statement imports all symbols whose names begin with 'foo' and 6977 contain at least one digit after it, and renames them, by prefixing 6978 their names with the string 'my_'. Thus, if 'A.mf' declared a 6979 function 'foo_1', it becomes visible under the name of 'my_foo_1'. 6980 6981 6982File: mailfromd.info, Node: Preprocessor, Next: Filter Script Example, Prev: Modules, Up: MFL 6983 69844.22 MFL Preprocessor 6985===================== 6986 6987Before compiling the script file, 'mailfromd' preprocesses it. The 6988built-in preprocessor handles only file inclusion (*note include::), 6989while the rest of traditional facilities, such as macro expansion, are 6990supported via 'm4', which is used as an external preprocessor. 6991 6992 The detailed description of 'm4' facilities lies far beyond the scope 6993of this document. 
You will find a complete user manual in *note GNU M4 6994manual: (m4)Top. For the rest of this section we assume the reader is 6995sufficiently acquainted with 'm4' macro processor. 6996 6997 The external preprocessor is invoked with '-s' flag, instructing it 6998to include line synchronization information in its output, which is 6999subsequently used by MFL compiler for purposes of error reporting. The 7000initial set of macro definitions is supplied in file 'pp-setup', located 7001in the library search path(1), which is fed to the preprocessor input 7002before the script file itself. The default 'pp-setup' file renames all 7003'm4' built-in macro names so they all start with the prefix 'm4_'(2). 7004It changes comment characters to '/*', '*/' pair, and leaves the default 7005quoting characters, grave ('`') and acute (''') accents without change. 7006Finally, 'pp-setup' defines the following macros: 7007 7008 -- M4 Macro: boolean defined (IDENTIFIER) 7009 The IDENTIFIER must be the name of an optional abstract argument to 7010 the function. This macro must be used only within a function 7011 definition. It expands to the MFL expression that yields 'true' if 7012 the actual parameter is supplied for IDENTIFIER. For example: 7013 7014 func rcut(string text; number num) 7015 returns string 7016 do 7017 if (defined(num)) 7018 return substr(text, length(text) - num) 7019 else 7020 return text 7021 fi 7022 done 7023 7024 This function will return last NUM characters of TEXT if NUM is 7025 supplied, and entire TEXT otherwise, e.g.: 7026 7027 rcut("text string") => "text string" 7028 rcut("text string", 3) => "ing" 7029 7030 Invoking the 'defined' macro with the name of a mandatory argument 7031 yields 'true' 7032 7033 -- M4 Macro: printf (FORMAT, ...) 7034 Provides a 'printf' statement, that formats its optional parameters 7035 in accordance with FORMAT and sends the resulting string to the 7036 current log output (*note Logging and Debugging::). 
*Note String 7037 formatting::, for a description of FORMAT. 7038 7039 Example usage: 7040 7041 printf('Function %s returned %d', funcname, retcode) 7042 7043 -- M4 Macro: string _ (MSGID) 7044 A convenience macro. Expands to a call to 'gettext' (*note NLS 7045 Functions::). 7046 7047 -- M4 Macro: string_list_iterate (LIST, DELIM, VAR, CODE) 7048 This macro intends to compensate for the lack of array data type in 7049 MFL. It splits the string LIST into segments delimited by string 7050 DELIM. For each segment, the MFL code CODE is executed. The code 7051 can use the variable VAR to refer to the segment string. 7052 7053 For example, the following fragment prints names of all existing 7054 directories listed in the 'PATH' environment variable: 7055 7056 string path getenv("PATH") 7057 string seg 7058 7059 string_list_iterate(path, ":", seg, ` 7060 if access(seg, F_OK) 7061 echo "%seg exists" 7062 fi') 7063 7064 Care should be taken to properly quote its arguments. In the code 7065 below the string 'str' is treated as a comma-separated list of 7066 values. To avoid interpreting the comma as argument delimiter the 7067 second argument must be quoted: 7068 7069 string_list_iterate(str, `","', seg, ` 7070 echo "next segment: " . seg') 7071 7072 -- M4 Macro: N_ (MSGID) 7073 A convenience macro, that expands to MSGID verbatim. It is 7074 intended to mark the literal strings that should appear in the 7075 '.po' file, where actual call to 'gettext' (*note NLS Functions::) 7076 cannot be used. For example: 7077 7078 /* Mark the variable for translation: cannot use gettext here */ 7079 string message N_("Mail accepted") 7080 7081 prog envfrom 7082 do 7083 ... 
            /* Translate and log the message */
            echo gettext(message)
          done

   You can obtain the preprocessed output, without starting actual
compilation, using the '-E' command line option:

     $ mailfromd -E file.mf

   The output is in the form of preprocessed source code, which is sent
to the standard output.  This can be useful, among other things, to
debug your own macro definitions.

   Macro definitions and deletions can be made on the command line, by
using the '-D' and '-U' options.  They have the following format:

'-D NAME[=VALUE]'
'--define=NAME[=VALUE]'
     Define a symbol NAME to have a value VALUE.  If VALUE is not
     supplied, the value is taken to be the empty string.  The VALUE can
     be any string, and the macro can be defined to take arguments, just
     as if it were defined from within the input using the 'm4_define'
     statement.

     For example, the following invocation defines symbol 'COMPAT' to
     have a value '43':

          $ mailfromd -DCOMPAT=43

'-U NAME'
'--undefine=NAME'
     A counterpart of the '-D' option is the option '-U' ('--undefine').
     It undefines a preprocessor symbol whose name is given as its
     argument.  The following example undefines the symbol 'COMPAT':

          $ mailfromd -UCOMPAT

   The following two options are supplied mainly for debugging purposes:

'--no-preprocessor'
     Disables the external preprocessor.

'--preprocessor=COMMAND'
     Use COMMAND as external preprocessor.  Be especially careful with
     this option, because 'mailfromd' cannot verify whether COMMAND is
     actually some kind of a preprocessor or not.

   ---------- Footnotes ----------

   (1) It is usually located in
'/usr/local/share/mailfromd/8.10/include/pp-setup'.

   (2) This is similar to the GNU m4 '--prefix-builtins' option.  This
approach was chosen to allow for using non-GNU 'm4' implementations as
well.
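
   As an illustration of how the '-D' option and the renamed 'm4'
built-ins described in this section can work together, here is a minimal
sketch.  It assumes that the default 'pp-setup' is in effect, so that
the 'm4' built-in 'ifdef' is available as 'm4_ifdef'.  The symbol name
'DEBUG_FILTER' is hypothetical and is used only for this example:

     prog envfrom
     do
       /* DEBUG_FILTER is a hypothetical symbol, set with -D */
       m4_ifdef(`DEBUG_FILTER', `echo "envfrom: sender is " . $f')
       pass
     done

   To see what the compiler would receive with and without the symbol,
compare the preprocessed output of the two invocations:

     $ mailfromd -E file.mf
     $ mailfromd -DDEBUG_FILTER -E file.mf

   The 'echo' statement should appear only in the second case.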
7138 7139 7140File: mailfromd.info, Node: Filter Script Example, Next: Reserved Words, Prev: Preprocessor, Up: MFL 7141 71424.23 Example of a Filter Script File 7143==================================== 7144 7145In this section we will discuss a working example of the filter script 7146file. For the ease of illustration, it is divided in several sections. 7147Each section is prefaced with a comment explaining its function. 7148 7149 This filter assumes that the 'mailfromd.conf' file contains the 7150following: 7151 7152 relayed-domain-file (/etc/mail/sendmail.cw, 7153 /etc/mail/relay-domains); 7154 io-timeout 33; 7155 database cache { 7156 negative-expire-interval 1 day; 7157 positive-expire-interval 2 weeks; 7158 }; 7159 7160 Of course, the exact parameter settings may vary, what is important 7161is that they be declared. *Note Mailfromd Configuration::, for a 7162description of 'mailfromd' configuration file syntax. 7163 7164 Now, let's return to the script. Its first part defines the 7165configuration settings for this host: 7166 7167 #pragma regex +extended +icase 7168 7169 set mailfrom_address "<>" 7170 set ehlo_domain "gnu.org.ua" 7171 7172 The second part loads the necessary source modules: 7173 7174 require 'status' 7175 require 'dns' 7176 require 'rateok' 7177 7178 Next we define 'envfrom' handler. In the first two rules, it accepts 7179all mails coming from the null address and from the machines which we 7180relay: 7181 7182 prog envfrom 7183 do 7184 if $f = "" 7185 accept 7186 elif relayed hostname($client_addr) 7187 accept 7188 elif hostname($client_addr) = $client_addr 7189 reject 550 5.7.7 "IP address does not resolve" 7190 7191 Next rule rejects all messages coming from hosts with dynamic IP 7192addresses. 
A regular expression used to catch such hosts is not 100%
fail-proof, but it tries to cover most existing host naming patterns:

       elif hostname($client_addr) matches
               ".*(adsl|sdsl|hdsl|ldsl|xdsl|dialin|dialup|\
               ppp|dhcp|dynamic|[-.]cpe[-.]).*"
         reject 550 5.7.1 "Use your SMTP relay"

   Messages coming from the machines whose host names contain something
similar to an IP address are subject to strict checking:

       elif hostname($client_addr) matches
               ".*[0-9]{1,3}[-.][0-9]{1,3}[-.][0-9]{1,3}[-.][0-9]{1,3}.*"
         on poll host $client_addr for $f do
         when success:
           pass
         when not_found or failure:
           reject 550 5.1.0 "Sender validity not confirmed"
         when temp_failure:
           tempfail
         done

   If the sender domain is relayed by any of the 'yahoo.com' or
'nameserver.com' 'MX's, no checks are performed.  We will greylist this
message in the 'envrcpt' handler:

       elif $f mx fnmatches "*.yahoo.com"
            or $f mx fnmatches "*.nameserver.com"
         pass

   Finally, if the message does not meet any of the above conditions, it
is verified by the standard procedure:

       else
         on poll $f do
         when success:
           pass
         when not_found or failure:
           reject 550 5.1.0 "Sender validity not confirmed"
         when temp_failure:
           tempfail
         done
       fi

   At the end of the handler we check that the sender-client pair does
not exceed the allowed mail sending rate:

       if not rateok("$f-$client_addr", interval("1 hour 30 minutes"), 100)
         tempfail 450 4.7.0 "Mail sending rate exceeded. Try again later"
       fi
     done

   The next part defines the 'envrcpt' handler.
Its primary purpose is to
greylist messages from some domains that could not be checked otherwise:

     prog envrcpt
     do
       set gltime 300
       if $f mx fnmatches "*.yahoo.com"
          or $f mx fnmatches "*.nameserver.com"
          and not dbmap("/var/run/whitelist.db", $client_addr)
         if greylist("$client_addr-$f-$rcpt_addr", gltime)
           if greylist_seconds_left = gltime
             tempfail 450 4.7.0
                    "You are greylisted for %gltime seconds"
           else
             tempfail 450 4.7.0
                    "Still greylisted for " .
                    %greylist_seconds_left . " seconds"
           fi
         fi
       fi
     done


File: mailfromd.info,  Node: Reserved Words,  Prev: Filter Script Example,  Up: MFL

4.24 Reserved Words
===================

For your reference, here is an alphabetical list of all reserved words:

   * __defpreproc__
   * __defstatedir__
   * __file__
   * __function__
   * __line__
   * __major__
   * __minor__
   * __module__
   * __package__
   * __patch__
   * __preproc__
   * __statedir__
   * __version__
   * accept
   * add
   * alias
   * and
   * begin
   * break
   * bye
   * case
   * catch
   * const
   * continue
   * default
   * delete
   * discard
   * do
   * done
   * echo
   * elif
   * else
   * end
   * fi
   * fnmatches
   * for
   * from
   * func
   * if
   * import
   * loop
   * matches
   * module
   * next
   * not
   * number
   * on
   * or
   * pass
   * precious
   * prog
   * public
   * reject
   * replace
   * require
   * return
   * returns
   * set
   * static
   * string
   * switch
   * tempfail
   * throw
   * try
   * vaptr
   * when
   * while

   Several keywords are context-dependent: 'mx' is a keyword if it
appears before 'matches' or 'fnmatches'.
Following strings are keywords 7344in 'on' context: 7345 7346 * as 7347 * host 7348 * poll 7349 7350 The following keywords are preprocessor macros: 7351 7352 * defined 7353 * _ (an underscore) 7354 * N_ 7355 7356 Any keyword beginning with a 'm4_' prefix is a reserved preprocessor 7357symbol. 7358 7359 7360File: mailfromd.info, Node: Library, Next: Using MFL Mode, Prev: MFL, Up: Top 7361 73625 The MFL Library Functions 7363*************************** 7364 7365This chapter describes library functions available in Mailfromd version 73668.10. For the simplicity of explanation, we use the word 'boolean' to 7367indicate variables of numeric type that are used as boolean values. For 7368such variables, the term 'False' stands for the numeric 0, and 'True' 7369for any non-zero value. 7370 7371* Menu: 7372 7373* Macro access:: 7374* String transformation:: 7375* String manipulation:: 7376* String formatting:: 7377* Character Type:: 7378* Email processing functions:: 7379* Envelope modification functions:: 7380* Header modification functions:: 7381* Body Modification Functions:: 7382* Message modification queue:: 7383* Mail header functions:: 7384* Mail body functions:: 7385* EOM Functions:: 7386* Current Message Functions:: 7387* Mailbox functions:: 7388* Message functions:: 7389* Quarantine functions:: 7390* SMTP Callout functions:: 7391* Compatibility Callout functions:: 7392* Internet address manipulation functions:: 7393* DNS functions:: 7394* Geolocation functions:: 7395* Database functions:: 7396* I/O functions:: 7397* System functions:: 7398* Passwd functions:: 7399* Sieve Interface:: 7400* Interfaces to Third-Party Programs:: 7401* Rate limiting functions:: 7402* Greylisting functions:: 7403* Special test functions:: 7404* Mail Sending Functions:: 7405* Blacklisting Functions:: 7406* SPF Functions:: 7407* DKIM:: 7408* Sockmaps:: 7409* NLS Functions:: 7410* Syslog Interface:: 7411* Debugging Functions:: 7412 7413 7414File: mailfromd.info, Node: Macro access, Next: 
String transformation, Up: Library 7415 74165.1 Sendmail Macro Access Functions 7417=================================== 7418 7419 -- Built-in Function: string getmacro (string MACRO) 7420 Returns the value of Sendmail macro MACRO. If MACRO is not 7421 defined, raises the 'e_macroundef' exception. 7422 7423 Calling 'getmacro(NAME)' is completely equivalent to referencing 7424 '${NAME}', except that it allows to construct macro names 7425 programmatically, e.g.: 7426 7427 if getmacro("auth_%var") = "foo" 7428 ... 7429 fi 7430 7431 -- Built-in Function: boolean macro_defined (string NAME) 7432 Return true if Sendmail macro NAME is defined. 7433 7434 Notice, that if your MTA supports macro name negotiation(1), you will 7435have to export macro names used by these two functions using '#pragma 7436miltermacros' construct. Consider this example: 7437 7438 func authcheck(string name) 7439 do 7440 string macname "auth_%name" 7441 if macro_defined(macname) 7442 if getmacro(macname) 7443 ... 7444 fi 7445 fi 7446 done 7447 7448 #pragma miltermacros envfrom auth_authen 7449 7450 prog envfrom 7451 do 7452 authcheck("authen") 7453 done 7454 7455 In this case, the parser cannot deduce that the 'envfrom' handler 7456will attempt to reference the 'auth_authen' macro, therefore the 7457'#pragma miltermacros' is used to help it. 7458 7459 ---------- Footnotes ---------- 7460 7461 (1) That is, if it supports Milter protocol 6 and upper. Sendmail 74628.14.0 and Postfix 2.6 and newer do. MeTA1 (via 'pmult') does as well. 7463*Note MTA Configuration::, for more details. 7464 7465 7466File: mailfromd.info, Node: String transformation, Next: String manipulation, Prev: Macro access, Up: Library 7467 74685.2 The 'sed' function 7469====================== 7470 7471The 'sed' function allows you to transform a string by replacing parts 7472of it that match a regular expression with another string. 
This function is somewhat similar to the 'sed' command line utility
(hence its name) and bears similarities to analogous functions in other
programming languages (e.g. 'sub' in 'awk' or the 's//' operator in
'perl').

 -- Built-in Function: string sed (string SUBJECT, EXPR, ...)
     Each EXPR argument is an "s-expression" of the form:

          s/REGEXP/REPLACEMENT/[FLAGS]

     where REGEXP is a "regular expression", and REPLACEMENT is a
     replacement string for each part of the SUBJECT that matches
     REGEXP.  When 'sed' is invoked, it attempts to match SUBJECT
     against the REGEXP.  If the match succeeds, the portion of SUBJECT
     which was matched is replaced with REPLACEMENT.  Depending on the
     value of FLAGS (*note global replace::), this process may continue
     until the entire SUBJECT has been scanned.

     The resulting output serves as input for the next argument, if one
     is supplied.  The process continues until all arguments have been
     applied.

     The function returns the output of the last s-expression.

   Both REGEXP and REPLACEMENT are described in detail in *note The "s"
Command: (sed)The "s" Command.

   Supported FLAGS are:

'g'
     Apply the replacement to _all_ matches to the REGEXP, not just the
     first.

'i'
     Use case-insensitive matching.  In the absence of this flag, the
     value set by the most recent '#pragma regex icase' is used (*note
     icase: pragma regex.).

'x'
     REGEXP is an "extended regular expression" (*note Extended regular
     expressions: (sed)Extended regexps.).  In the absence of this flag,
     the value set by the most recent '#pragma regex extended' (if any)
     is used (*note extended: pragma regex.).

'NUMBER'
     Only replace the NUMBERth match of the REGEXP.

     Note: the POSIX standard does not specify what should happen when
     you mix the 'g' and NUMBER modifiers.
     'Mailfromd' follows the GNU 'sed' implementation in this regard,
     so the interaction is defined to be: ignore matches before the
     NUMBERth, and then match and replace all matches from the NUMBERth
     on.

   Any delimiter can be used in lieu of '/', the only requirement being
that it be used consistently throughout the expression.  For example,
the following two expressions are equivalent:

     s/one/two/
     s,one,two,

   Changing delimiters is often useful when the REGEXP contains slashes.
For instance, it is more convenient to write 's,/,-,' than 's/\//-/'.

   Here is an example of 'sed' usage:

     set email sed(input, 's/^<(.*)>$/\1/x')

It removes angle brackets from the value of the 'input' variable and
assigns the result to 'email'.

   To apply several s-expressions to the same input, you can either give
them as multiple arguments to the 'sed' function:

     set email sed(input, 's/^<(.*)>$/\1/x', 's/(.+@)(.+)/\1\L\2\E/x')

or give them in a single argument separated with semicolons:

     set email sed(input, 's/^<(.*)>$/\1/x;s/(.+@)(.+)/\1\L\2\E/x')

Both examples above remove optional angle brackets and convert the
domain name part to lower case.

   Regular expressions used in 'sed' arguments are controlled by the
'#pragma regex', like any other regular expressions used throughout the
MFL source file.  To avoid using the 'x' modifier in the above example,
one can write:

     #pragma regex +extended
     set email sed(input, 's/^<(.*)>$/\1/', 's/(.+@)(.+)/\1\L\2\E/')

   *Note regex::, for details about that '#pragma'.

   So far all examples used constant s-expressions.  However, this is
not a requirement.  If necessary, the expression can be stored in a
variable or even constructed on the fly before passing it as an argument
to 'sed'.
For example, assume that you wish to remove the domain part from 7569the value, but only if that part matches one of predefined domains. Let 7570a regular expression that matches these domains be stored in the 7571variable 'domain_rx'. Then this can be done as follows: 7572 7573 set email sed(input, "s/(.+)(@%domain_rx)/\1/") 7574 7575 If the constructed regular expression uses variables whose value 7576should be matched exactly, such variables must be quoted before being 7577used as part of the regexp. Mailfromd provides a convenience function 7578for this: 7579 7580 -- Built-in Function: string qr (string STR[; string DELIM]) 7581 Quote the string STR as a regular expression. This function 7582 selects the characters to be escaped using the currently selected 7583 regular expression flavor (*note regex::). At most two additional 7584 characters that must be escaped can be supplied in the DELIM 7585 optional parameter. For example, to quote the variable 'x' for use 7586 in double-quoted s-expression: 7587 7588 qr(x, '/"') 7589 7590 7591File: mailfromd.info, Node: String manipulation, Next: String formatting, Prev: String transformation, Up: Library 7592 75935.3 String Manipulation Functions 7594================================= 7595 7596 -- Built-in Function: string escape (string STR, [string CHARS]) 7597 Returns a copy of STR with the characters from CHARS escaped, i.e. 7598 prefixed with a backslash. If CHARS is not specified, '\"' is 7599 assumed. 7600 7601 escape('"a\tstr"ing') => '\"a\\tstr\"ing' 7602 escape('new "value"', '\" ') => 'new\ \"value\"' 7603 7604 -- Built-in Function: string unescape (string STR) 7605 Performs the reverse to 'escape', i.e. removes any prefix 7606 backslash characters. 
7607 7608 unescape('a \"quoted\" string') => 'a "quoted" string' 7609 7610 -- Built-in Function: string unescape (string STR, [string CHARS]) 7611 7612 -- Built-in Function: string domainpart (string STR) 7613 Returns the domain part of STR, if it is a valid email address, 7614 otherwise returns STR itself. 7615 7616 domainpart("gray") => "gray" 7617 domainpart("gray@gnu.org.ua") => "gnu.org.ua" 7618 7619 -- Built-in Function: number index (string S, string T) 7620 -- Built-in Function: number index (string S, string T, number START) 7621 Returns the index of the first occurrence of the string T in the 7622 string S, or -1 if T is not present. 7623 7624 index("string of rings", "ring") => 2 7625 7626 Optional argument START, if supplied, indicates the position in 7627 string where to start searching. 7628 7629 index("string of rings", "ring", 3) => 10 7630 7631 To find the last occurrence of a substring, use the function RINDEX 7632 (*note rindex::). 7633 7634 -- Built-in Function: number interval (string STR) 7635 Converts STR, which should be a valid time interval specification 7636 (*note time interval specification::), to seconds. 7637 7638 -- Built-in Function: number length (string STR) 7639 Returns the length of the string STR in bytes. 7640 7641 length("string") => 6 7642 7643 -- Built-in Function: string dequote (string STR) 7644 Removes '<' and '>' surrounding STR. If STR is not enclosed by 7645 angle brackets or these are unbalanced, the argument is returned 7646 unchanged: 7647 7648 dequote("<root@gnu.org.ua>") => "root@gnu.org.ua" 7649 dequote("root@gnu.org.ua") => "root@gnu.org.ua" 7650 dequote("there>") => "there>" 7651 7652 -- Built-in Function: string localpart (string STR) 7653 Returns the local part of STR if it is a valid email address, 7654 otherwise returns STR unchanged. 
7655 7656 localpart("gray") => "gray" 7657 localpart("gray@gnu.org.ua") => "gray" 7658 7659 -- Built-in Function: string replstr (string S, number N) 7660 Replicate a string, i.e. return a string, consisting of S repeated 7661 N times: 7662 7663 replstr("12", 3) => "121212" 7664 7665 -- Built-in Function: string revstr (string S) 7666 Returns the string composed of the characters from S in reversed 7667 order: 7668 7669 revstr("foobar") => "raboof" 7670 7671 -- Built-in Function: number rindex (string S, string T) 7672 -- Built-in Function: number rindex (string S, string T, number START) 7673 7674 Returns the index of the last occurrence of the string T in the 7675 string S, or -1 if T is not present. 7676 7677 rindex("string of rings", "ring") => 10 7678 7679 Optional argument START, if supplied, indicates the position in 7680 string where to start searching. E.g.: 7681 7682 rindex("string of rings", "ring", 10) => 2 7683 7684 See also *note 'index' built-in function: index-built-in. 7685 7686 -- Built-in Function: string substr (string STR, number START) 7687 -- Built-in Function: string substr (string STR, number START, number 7688 LENGTH) 7689 7690 Returns the at most LENGTH-character substring of STR starting at 7691 START. If LENGTH is omitted, the rest of STR is used. 7692 7693 If LENGTH is greater than the actual length of the string, the 7694 'e_range' exception is signalled. 7695 7696 substr("mailfrom", 4) => "from" 7697 substr("mailfrom", 4, 2) => "fr" 7698 7699 -- Built-in Function: string substring (string STR, number START, 7700 number END) 7701 Returns a substring of STR between offsets START and END, 7702 inclusive. Negative END means offset from the end of the string. 
     In other words, to obtain a substring from START to the end of the
     string, use 'substring(STR, START, -1)':

          substring("mailfrom", 0, 3) => "mail"
          substring("mailfrom", 2, 5) => "ilfr"
          substring("mailfrom", 4, -1) => "from"
          substring("mailfrom", 4, length("mailfrom") - 1) => "from"
          substring("mailfrom", 4, -2) => "fro"

     This function signals the 'e_range' exception if either START or
     END is outside the string.

 -- Built-in Function: string tolower (string STR)

     Returns a copy of the string STR, with all the upper-case
     characters translated to their corresponding lower-case
     counterparts.  Non-alphabetic characters are left unchanged.

          tolower("MAIL") => "mail"

 -- Built-in Function: string toupper (string STR)
     Returns a copy of the string STR, with all the lower-case
     characters translated to their corresponding upper-case
     counterparts.  Non-alphabetic characters are left unchanged.

          toupper("mail") => "MAIL"

 -- Built-in Function: string ltrim (string STR[, string CSET])
     Returns a copy of the input string STR with any leading characters
     present in CSET removed.  If the latter is not given, white space
     is removed (spaces, tabs, newlines, carriage returns, and line
     feeds).

          ltrim("  a string") => "a string"
          ltrim("089", "0") => "89"

     Note the last example.  It shows how 'ltrim' can be used to handle
     decimal numbers whose string representation begins with '0'.
     Normally such strings will be treated as representing octal
     numbers.  If they are indeed decimal, use 'ltrim' to strip off the
     leading zeros, e.g.:

          set dayofyear ltrim(strftime('%j', time()), "0")

 -- Built-in Function: string rtrim (string STR[, string CSET])
     Returns a copy of the input string STR with any trailing characters
     present in CSET removed.
If the latter is not given, white space 7750 is removed (spaces, tabs, newlines, carriage returns, and line 7751 feeds). 7752 7753 -- Built-in Function: number vercmp (string A, string B) 7754 Compares two strings as 'mailfromd' version numbers. The result is 7755 negative if B precedes A, zero if they refer to the same version, 7756 and positive if B follows A: 7757 7758 vercmp("5.0", "5.1") => 1 7759 vercmp("4.4", "4.3") => -1 7760 vercmp("4.3.1", "4.3") => -1 7761 vercmp("8.0", "8.0") => 0 7762 7763 -- Library Function: string sa_format_score (number CODE, number PREC) 7764 Format CODE as a floating-point number with PREC decimal digits: 7765 7766 sa_format_score(5000, 3) => "5.000" 7767 7768 This function is convenient for formatting SpamAssassin scores for 7769 use in message headers and textual reports. It is defined in 7770 module 'sa.mf'. 7771 7772 *Note SpamAssassin: sa, for examples of its use. 7773 7774 -- Library Function: string sa_format_report_header (string TEXT) 7775 Format a SpamAssassin report text in order to include it in a RFC 7776 822 header. This function selects the score listing from TEXT, and 7777 prefixes each line with '* '. Its result looks like: 7778 7779 * 0.2 NO_REAL_NAME From: does not include a real name 7780 * 0.1 HTML_MESSAGE BODY: HTML included in message 7781 7782 *Note SpamAssassin: sa, for examples of its use. 7783 7784 -- Library Function: string strip_domain_part (string DOMAIN, number N) 7785 7786 Returns at most N last components of the domain name DOMAIN. If N 7787 is 0 the function returns DOMAIN. 7788 7789 This function is defined in the module 'strip_domain_part.mf' 7790 (*note Modules::). 7791 7792 Examples: 7793 7794 require strip_domain_part 7795 strip_domain_part("puszcza.gnu.org.ua", 2) => "org.ua" 7796 strip_domain_part("puszcza.gnu.org.ua", 0) => "puszcza.gnu.org.ua" 7797 7798 -- Library Function: boolean is_ip (string STR) 7799 7800 Returns 'true' if STR is a valid IPv4 address. 
This function is 7801 defined in the module 'is_ip.mf' (*note Modules::). 7802 7803 For example: 7804 7805 require is_ip 7806 7807 is_ip("1.2.3.4") => 1 7808 is_ip("1.2.3.x") => 0 7809 is_ip("blah") => 0 7810 is_ip("255.255.255.255") => 1 7811 is_ip("0.0.0.0") => 1 7812 7813 -- Library Function: string revip (string IP) 7814 7815 Reverses octets in IP, which must be a valid string representation 7816 of an IPv4 address. 7817 7818 Example: 7819 7820 'revip("127.0.0.1") => "1.0.0.127"' 7821 7822 -- Library Function: string verp_extract_user (string EMAIL, string 7823 DOMAIN) 7824 7825 If EMAIL is a valid VERP-style email address for DOMAIN, this 7826 function returns the user name, corresponding to that email. 7827 Otherwise, it returns empty string. 7828 7829 verp_extract_user("gray=gnu.org.ua@tuhs.org", 'gnu\..*') 7830 => "gray" 7831 7832 7833File: mailfromd.info, Node: String formatting, Next: Character Type, Prev: String manipulation, Up: Library 7834 78355.4 String formatting 7836===================== 7837 7838 -- Built-in Function: string sprintf (string FORMAT, ...) 7839 The function 'sprintf' formats its argument according to FORMAT 7840 (see below) and returns the resulting string. It takes varying 7841 number of parameters, the only mandatory one being FORMAT. 7842 7843Format string 7844------------- 7845 7846The format string is a simplified version of the format argument to C 7847'printf'-family functions. 7848 7849 The format string is composed of zero or more "directives": ordinary 7850characters (not '%'), which are copied unchanged to the output stream; 7851and "conversion specifications", each of which results in fetching zero 7852or more subsequent arguments. Each conversion specification is 7853introduced by the character '%', and ends with a conversion specifier. 7854In between there may be (in this order) zero or more "flags", an 7855optional "minimum field width", and an optional "precision". 
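
   The following sketch shows how these parts of a conversion
specification combine.  The results are those one would expect from the
rules given in the rest of this section (which mirror the behavior of C
'printf'); the argument values are chosen only for illustration:

     sprintf('%5d', 42)            => "   42"
     sprintf('%-5d|', 42)          => "42   |"
     sprintf('%05d', 42)           => "00042"
     sprintf('%#x', 255)           => "0xff"
     sprintf('%.3s', "mailfromd")  => "mai"

   Here '5' is a minimum field width, '-', '0' and '#' are flags, and
'.3' is a precision.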
7856 7857 Notice, that in practice that means that you should use single quotes 7858with the FORMAT arguments, to protect conversion specifications from 7859being recognized as variable references (*note singe-vs-double::). 7860 7861 No type conversion is done on arguments, so it is important that the 7862supplied arguments match their corresponding conversion specifiers. By 7863default, the arguments are used in the order given, where each '*' and 7864each conversion specifier asks for the next argument. If insufficiently 7865many arguments are given, 'sprintf' raises 'e_range' exception. One can 7866also specify explicitly which argument is taken, at each place where an 7867argument is required, by writing '%M$', instead of '%' and '*M$' instead 7868of '*', where the decimal integer M denotes the position in the argument 7869list of the desired argument, indexed starting from 1. Thus, 7870 7871 sprintf('%*d', width, num); 7872and 7873 sprintf('%2$*1$d', width, num); 7874are equivalent. The second style allows repeated references to the same 7875argument. 7876 7877Flag characters 7878--------------- 7879 7880The character '%' is followed by zero or more of the following "flags": 7881 7882'#' 7883 The value should be converted to an "alternate form". For 'o' 7884 conversions, the first character of the output string is made zero 7885 (by prefixing a '0' if it was not zero already). For 'x' and 'X' 7886 conversions, a non-zero result has the string '0x' (or '0X' for 'X' 7887 conversions) prepended to it. Other conversions are not affected 7888 by this flag. 7889 7890'0' 7891 The value should be zero padded. For 'd', 'i', 'o', 'u', 'x', and 7892 'X' conversions, the converted value is padded on the left with 7893 zeros rather than blanks. If the '0' and '-' flags both appear, 7894 the '0' flag is ignored. If a precision is given, the '0' flag is 7895 ignored. Other conversions are not affected by this flag. 

'-'
     The converted value is to be left adjusted on the field boundary.
     (The default is right justification.) The converted value is
     padded on the right with blanks, rather than on the left with
     blanks or zeros. A '-' overrides a '0' if both are given.

' ' (a space)
     A blank should be left before a positive number (or empty string)
     produced by a signed conversion.

'+'
     A sign ('+' or '-') should always be placed before a number
     produced by a signed conversion. By default a sign is used only
     for negative numbers. A '+' overrides a space if both are used.

Field width
-----------

An optional decimal digit string (with nonzero first digit) specifying a
minimum field width. If the converted value has fewer characters than
the field width, it will be padded with spaces on the left (or right, if
the left-adjustment flag has been given). Instead of a decimal digit
string one may write '*' or '*M$' (for some decimal integer M) to
specify that the field width is given in the next argument, or in the
M-th argument, respectively, which must be of numeric type. A negative
field width is taken as a '-' flag followed by a positive field width.
In no case does a non-existent or small field width cause truncation of
a field; if the result of a conversion is wider than the field width,
the field is expanded to contain the conversion result.

Precision
---------

An optional precision, in the form of a period ('.') followed by an
optional decimal digit string. Instead of a decimal digit string one
may write '*' or '*M$' (for some decimal integer M) to specify that the
precision is given in the next argument, or in the M-th argument,
respectively, which must be of numeric type. If the precision is given
as just '.', or the precision is negative, the precision is taken to be
zero.
This gives the minimum number of digits to appear for 'd', 'i',
'o', 'u', 'x', and 'X' conversions, or the maximum number of characters
to be printed from a string for the 's' conversion.

Conversion specifier
--------------------

A character that specifies the type of conversion to be applied. The
conversion specifiers and their meanings are:

d
i
     The numeric argument is converted to signed decimal notation. The
     precision, if any, gives the minimum number of digits that must
     appear; if the converted value requires fewer digits, it is padded
     on the left with zeros. The default precision is '1'. When '0' is
     printed with an explicit precision '0', the output is empty.

o
u
x
X
     The numeric argument is converted to unsigned octal ('o'), unsigned
     decimal ('u'), or unsigned hexadecimal ('x' and 'X') notation. The
     letters 'abcdef' are used for 'x' conversions; the letters 'ABCDEF'
     are used for 'X' conversions. The precision, if any, gives the
     minimum number of digits that must appear; if the converted value
     requires fewer digits, it is padded on the left with zeros. The
     default precision is '1'. When '0' is printed with an explicit
     precision '0', the output is empty.

s
     The string argument is written to the output. If a precision is
     specified, no more than the specified number of characters are
     written.

%
     A '%' is written. No argument is converted. The complete
     conversion specification is '%%'.


File: mailfromd.info, Node: Character Type, Next: Email processing functions, Prev: String formatting, Up: Library

5.5 Character Type
==================

These functions check whether all characters of STR fall into a certain
character class according to the 'C' ('POSIX') locale(1). 'True' (1) is
returned if they do, 'false' (0) is returned otherwise.
In the latter
case, the global variable 'ctype_mismatch' is set to the index of the
first character that is outside of the character class (characters are
indexed from 0).

 -- Built-in Function: boolean isalnum (string STR)
     Checks for alphanumeric characters:

          isalnum("a123") => 1
          isalnum("a.123") => 0 (ctype_mismatch = 1)

 -- Built-in Function: boolean isalpha (string STR)
     Checks for alphabetic characters:

          isalpha("abc") => 1
          isalpha("a123") => 0

 -- Built-in Function: boolean isascii (string STR)
     Checks whether all characters in STR are 7-bit ones, that fit into
     the ASCII character set.

          isascii("abc") => 1
          isascii("ab\0200") => 0

 -- Built-in Function: boolean isblank (string STR)
     Checks if STR contains only blank characters; that is, spaces or
     tabs.

 -- Built-in Function: boolean iscntrl (string STR)
     Checks for control characters.

 -- Built-in Function: boolean isdigit (string STR)
     Checks for digits (0 through 9).

 -- Built-in Function: boolean isgraph (string STR)
     Checks for any printable characters except spaces.

 -- Built-in Function: boolean islower (string STR)
     Checks for lower-case characters.

 -- Built-in Function: boolean isprint (string STR)
     Checks for printable characters including space.

 -- Built-in Function: boolean ispunct (string STR)
     Checks for any printable characters that are neither spaces nor
     alphanumeric characters.

 -- Built-in Function: boolean isspace (string STR)
     Checks for white-space characters, i.e.: space, form-feed ('\f'),
     newline ('\n'), carriage return ('\r'), horizontal tab ('\t'), and
     vertical tab ('\v').

 -- Built-in Function: boolean isupper (string STR)
     Checks for uppercase letters.

 -- Built-in Function: boolean isxdigit (string STR)
     Checks for hexadecimal digits, i.e. one of '0', '1', '2', '3',
     '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f', 'A',
     'B', 'C', 'D', 'E', 'F'.

   ---------- Footnotes ----------

   (1) Support for other locales is planned for future versions.


File: mailfromd.info, Node: Email processing functions, Next: Envelope modification functions, Prev: Character Type, Up: Library

5.6 Email processing functions.
===============================

 -- Built-in Function: number email_map (string EMAIL)
     Parses EMAIL and returns a bitmap, consisting of zero or more of
     the following flags:

     'EMAIL_MULTIPLE'
          EMAIL has more than one email address.

     'EMAIL_COMMENTS'
          EMAIL has comment parts.

     'EMAIL_PERSONAL'
          EMAIL has personal part.

     'EMAIL_LOCAL'
          EMAIL has local part.

     'EMAIL_DOMAIN'
          EMAIL has domain part.

     'EMAIL_ROUTE'
          EMAIL has route part.

     These constants are declared in the 'email.mf' module. The
     function 'email_map' returns 0 if its argument is not a valid email
     address.

 -- Library Function: boolean email_valid (string EMAIL)
     Returns 'True' (1) if EMAIL is a valid email address, consisting of
     local and domain parts only. E.g.:

          email_valid("gray@gnu.org") => 1
          email_valid("gray") => 0
          email_valid('"Sergey Poznyakoff" <gray@gnu.org>') => 0

     This function is defined in 'email.mf' (*note Modules::).


File: mailfromd.info, Node: Envelope modification functions, Next: Header modification functions, Prev: Email processing functions, Up: Library

5.7 Envelope Modification Functions
===================================

Envelope modification functions set sender and add or delete recipient
addresses from the message envelope.
This allows MFL scripts to
redirect messages to other addresses.

 -- Built-in Function: void set_from (string EMAIL [, string ARGS])
     Sets envelope sender address to EMAIL, which must be a valid email
     address. Optional ARGS supply arguments to ESMTP 'MAIL FROM'
     command.

 -- Built-in Function: void rcpt_add (string ADDRESS)
     Add the e-mail ADDRESS to the envelope.

 -- Built-in Function: void rcpt_delete (string ADDRESS)
     Remove ADDRESS from the envelope.

   The following example code uses these functions to implement a simple
alias-like capability:

     prog envrcpt
     do
        string alias dbget(aliasdb, $1, "NULL", 1)
        if alias != "NULL"
           rcpt_delete($1)
           rcpt_add(alias)
        fi
     done


File: mailfromd.info, Node: Header modification functions, Next: Body Modification Functions, Prev: Envelope modification functions, Up: Library

5.8 Header Modification Functions
=================================

There are two ways to modify message headers in an MFL script. The
first is to use header actions, described in *note Actions::, and the
second is to use message modification functions. Compared with the
actions, the functions offer several advantages. For example, using
functions you can construct the name of the header to operate upon (e.g.
by concatenating several arguments), something which is impossible when
using actions. Moreover, apart from the three basic operations (add,
modify and remove) supported by header actions, header functions allow
you to insert a new header at a particular place.

 -- Built-in Function: void header_add (string NAME, string VALUE)
     Adds a header 'NAME: VALUE' to the message.

     In contrast to the 'add' action, this function allows you to
     construct the header name using arbitrary MFL expressions.
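
     As a minimal sketch of building the header name at run time (the
     header name 'X-Filter-Checked' and the variable 'prefix' are
     hypothetical, used only for illustration):

          prog eom
          do
            # Assemble the header name from two parts
            string prefix "X-Filter-"
            header_add(prefix . "Checked", "yes")
          done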

 -- Built-in Function: void header_add (string NAME, string VALUE,
          number IDX)
     This syntax is preserved for backward compatibility. It is
     equivalent to 'header_insert', which see.

 -- Built-in Function: void header_insert (string NAME, string VALUE,
          number IDX)
     This function inserts a header 'NAME: VALUE' at the IDXth header
     position in the internal list of headers maintained by the MTA.
     That list contains headers added to the message either by the
     filter or by the MTA itself, but not the headers included in the
     message itself. Some of the headers in this list are conditional,
     e.g. the ones added by the 'H?COND?' directive in 'sendmail.cf'.
     The MTA evaluates them after all header modifications have been
     done and removes those headers whose conditions yield false. This
     means that the position at which the header added by
     'header_insert' appears in the final message may differ from
     IDX.

 -- Built-in Function: void header_delete (string NAME [, number INDEX])
     Delete header NAME from the message. If INDEX is given, delete
     the INDEXth instance of the header NAME.

     Notice the differences between this function and the 'delete'
     action:

       1. It allows the header name to be constructed, whereas 'delete'
          requires it to be a literal string.

       2. The optional INDEX argument selects a particular header
          instance to delete.

 -- Built-in Function: void header_replace (string NAME, string VALUE [,
          number INDEX])
     Replace the value of the header NAME with VALUE. If INDEX is
     given, replace the INDEXth instance of header NAME.

     Notice the differences between this function and the 'replace'
     action:

       1. It allows the header name to be constructed, whereas 'replace'
          requires it to be a literal string.

       2. The optional INDEX argument selects a particular header
          instance to replace.
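
     As a minimal sketch of the INDEX argument, the handler below
     removes only the second instance of one header and rewrites the
     first instance of another. The header names and values are
     hypothetical, and the sketch assumes that instances are counted
     from 1:

          prog eom
          do
            # Drop the second 'X-Spam-Flag' header, keep the others
            header_delete("X-Spam-Flag", 2)
            # Rewrite the first 'X-Priority' header
            header_replace("X-Priority", "3", 1)
          done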

 -- Library Function: void header_rename (string NAME, string NEWNAME[,
          number IDX])

     Defined in the module 'header_rename.mf'.
     Available only in the 'eom' handler.

     Renames the IDXth instance of header NAME to NEWNAME. If IDX is
     not given, assumes 1.

     If the specified header or the IDXth instance of it is not present
     in the current message, the function silently returns. All other
     errors cause a run-time exception.

     The position of the renamed header in the header list is not
     preserved.

     The example below renames the 'Subject' header to 'X-Old-Subject':

          require 'header_rename'

          prog eom
          do
            header_rename("Subject", "X-Old-Subject")
          done

 -- Library Function: void header_prefix_all (string NAME [, string
          PREFIX])

     Defined in the module 'header_rename.mf'.
     Available only in the 'eom' handler.

     Renames all headers named NAME by prefixing them with PREFIX. If
     PREFIX is not supplied, removes all such headers.

     All renamed headers will be placed in a continuous block in the
     header list. The absolute position in the header list will change.
     Relative ordering of renamed headers will be preserved.

 -- Library Function: void header_prefix_pattern (string PATTERN, string
          PREFIX)

     Defined in the module 'header_rename.mf'.
     Available only in the 'eom' handler.

     Renames all headers with names matching PATTERN (in the sense of
     'fnmatch', *note fnmatches: Special comparisons.) by prefixing them
     with PREFIX.

     All renamed headers will be placed in a continuous block in the
     header list. The absolute position in the header list will change.
     Relative ordering of renamed headers will be preserved.

     If called with one argument, removes all headers matching PATTERN.

     For example, to prefix all headers beginning with 'X-Spamd-' with
     an additional 'X-':

          require 'header_rename'

          prog eom
          do
            header_prefix_pattern("X-Spamd-*", "X-")
          done


File: mailfromd.info, Node: Body Modification Functions, Next: Message modification queue, Prev: Header modification functions, Up: Library

5.9 Body Modification Functions
===============================

Body modification is an experimental feature of MFL. Version 8.10
provides two functions for that purpose.

 -- Built-in Function: void replbody (string TEXT)
     Replace the body of the message with TEXT. Notice that TEXT must
     not contain RFC 822 headers. See the previous section if you want
     to manipulate message headers.

     Example:

          replbody("Body of this message has been removed by the mail filter.")

     No restrictions are imposed on the format of TEXT.

 -- Built-in Function: void replbody_fd (number FD)
     Replaces the body of the message with the content of the stream FD.
     Use this function if the body is very big, or if it is returned by
     an external program.

     Notice that this function starts reading from the current position
     in FD. Use 'rewind' if you wish to read from the beginning of the
     stream.

     The example below shows how to preprocess the body of the message
     using the external program '/usr/bin/mailproc', which is supposed
     to read the body from its standard input and write the processed
     text to its standard output:

          number fd # Temporary file descriptor

          prog data
          do
            # Open the temporary file
            set fd tempfile()
          done

          prog body
          do
            # Write the body to it.
            write_body(fd, $1, $2)
          done

          prog eom
          do
            # Use the resulting stream as the stdin to the mailproc
            # command and read the new body from its standard output.
            rewind(fd)
            replbody_fd(spawn("</usr/bin/mailproc", fd))
          done

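
     When the replacement text is short and known in advance,
     'replbody' is enough and no temporary stream is needed. The
     sketch below also records the change with 'header_add'; the
     header name 'X-Body-Replaced' is hypothetical:

          prog eom
          do
            replbody("Body of this message has been removed by the mail filter.")
            # Leave a trace of the modification in the headers
            header_add("X-Body-Replaced", "yes")
          done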