Distribution of Milter responsibility
=====================================

Milters look at the SMTP commands as well as the message content.
In Postfix these are handled by different processes:

- smtpd(8) (the SMTP server) focuses on the SMTP commands, strips
  the SMTP encapsulation, and passes envelope information and message
  content to the cleanup server.

- the cleanup(8) server parses the message content (it understands
  headers, body, and MIME structure), and creates a queue file with
  envelope and content information. The cleanup server adds additional
  envelope records, such as when to send a "delayed mail" notice.

If we want to support message modifications (add/delete recipient,
add/delete/replace header, replace body) then it pretty much has
to be implemented in the cleanup server, if we want to avoid extra
temporary files.

Network versus local submission
===============================

As of Sendmail 8.12, all mail is received via SMTP, so all mail is
subject to Miltering (local submissions are queued in a submission
queue and then delivered via SMTP to the main MTA, or appended to
$HOME/dead.letter). In Postfix, local submissions are received by
the pickup server, which feeds the mail into the cleanup server
after doing basic sanity checks.

How do we set up the Milters with SMTP mail versus local submissions?

- SMTP mail: smtpd creates Milter contexts, and sends them, including
  their sockets, to the cleanup server. The smtpd is responsible
  for sending the Milter abort and close messages. Both smtpd and
  cleanup are responsible for closing their Milter socket. Since
  smtpd and cleanup inspect mail at different times, there is no
  conflict with access to the Milter socket.

- Local submission: the cleanup server creates Milter contexts.
  The cleanup server provides dummy connect and helo information,
  or perhaps none at all, and provides sender and recipient events.
  The cleanup server is responsible for sending the Milter abort
  and close messages, and for closing the Milter socket.

A special case of local submission is "sendmail -t". This creates
a record stream in which recipients appear after content. However,
Milters expect to receive envelope information before content, not
after. This is not a problem: just like a queue manager, the
cleanup-side Milter client can jump around through the queue file
and send the information to the Milter in the expected order.

Interaction with XCLIENT, "postsuper -r", and external content filters
======================================================================

Milter applications expect that the MTA supplies context information
in the form of Sendmail-like macros (j=hostname, {client_name}=the
SMTP client hostname, etc.). Not all these macros have a Postfix
equivalent. Postfix 2.3 makes a subset available.

If Postfix does not implement a specific macro, people can usually
work around it. But we should avoid inconsistency. If Postfix can
make macro X available at Milter protocol stage Y, then it must
also be able to make that macro available at all later Milter
protocol stages, even when some of those stages are handled by a
different Postfix process.

Thus, when adding Milter support for a specific Sendmail-like macro
to the SMTP server:

- We may have to update the XCLIENT protocol, so that Milter
  applications can be tested with XCLIENT. If not, then we must
  prominently document everywhere that XCLIENT does not provide
  100% accurate simulation for Milters. An additional complication
  is that the SMTP command length is limited, and that each XCLIENT
  command resets the SMTP server to the 220 stage and generates
  "connect" events for anvil(8) and for Milters.

- The SMTP server has to send the corresponding attribute to the
  cleanup server. The cleanup server then stores the attribute in
  the queue file, so that Milters produce consistent results when
  mail is re-queued with "postsuper -r".

But wait, there is more. If mail is filtered by an external content
filter, then it needs to preserve all the Milter attributes so that
after "postsuper -r", Milters produce the exact same result as when
mail was received originally by Postfix. Specifically, after
"postsuper -r" a signing Milter must not sign mail that it did not
sign on the first pass through Postfix, and it must not reject mail
that it accepted on the first pass through Postfix.

Instead of trying to re-create the Milter execution environment
after "postsuper -r" we simply disable Milter processing. The
rationale for this is: if mail was Miltered before it was written
to queue file, then there is no need to Milter it again.

We might want to take a similar approach with external (signing or
blocking) content filters: don't filter mail that has already been
filtered, and don't filter mail that didn't need to be filtered.
Such mail can be recognized by the absence of a "content_filter"
record. To make the implementation efficient, the cleanup server
would have to record the presence of a "content_filter" record in
the queue file header.

Message envelope or content modifications
=========================================

Milters can send modification requests after receiving the end of
the message body. If we can implement all the header/body-related
Milter operations in the cleanup server, then we can try to edit
the queue file in place, without ever having to make a temporary
copy. Once a Milter is done editing, the queue file can be used as
input for the next Milter, and so on. Finally, the cleanup server
calls fsync() and waits for successful return.

To implement in-place queue file edits, we need to introduce
surprisingly little change to the existing Postfix queue file
structure. All we need is a way to specify a jump from one place
in the file to another.

Postfix does not store queue files as plain text files. Instead all
information is stored in records with an explicit type and length
for sender, recipient, arrival time, and so on. Even the content
that makes up the message header and body is stored as records with
an explicit type and length. This organization makes it very easy
to introduce pointer records, which is what we will use to jump
from one place in a queue file to another place.

- Deleting a recipient or header record is easy - just mark the
  record as killed. When deleting a recipient, we must kill all
  recipient records that result from virtual alias expansion of the
  original recipient address. When deleting a very long header or
  body line, multiple queue file records may need to be killed. We
  won't try to reuse the deleted space for other purposes.

- Replacing header or body records involves pointer records.
  Basically, a record is replaced by overwriting it with a forward
  pointer to space after the end of the queue file, putting the new
  record there, followed by a reverse pointer to the record that
  follows the replaced information. If the replaced record is shorter
  than a pointer record, we relocate the records that follow it to
  the new area, until we have enough space for the forward pointer
  record. See below for a discussion on what it takes to make this
  safe.

  Postfix queue files are segmented. The first segment is for
  envelope records, the second for message header and body content,
  and the third segment is for information that was extracted or
  generated from the message header and body content. Each segment
  is terminated by a marker record. For now we don't want to change
  their location. In particular, we want to avoid moving the start
  of a segment.

  To ensure that we can always replace a header or body record by
  a pointer record, without having to relocate a marker record, the
  cleanup server always places a dummy pointer record at the end
  of the headers and at the end of the body.

  When a Milter wants to replace an entire body, we have the option
  to overwrite existing body records until we run out of space, and
  then write a pointer to space at the end of the queue file,
  followed by the remainder of the body, and a pointer to the marker
  that ends the message content segment.

- Appending a recipient or header record involves pointer records
  as well. This requires that the queue file already contains a
  dummy pointer record at the place where we want to append recipient
  or header content (Milters currently do not replace individual
  body records, but we could add this if need be). To append,
  change the dummy pointer into a forward pointer to space after
  the end of a message, put the new record there, followed by a
  reverse pointer to the record that follows the forward pointer.

  To append another record, replace the reverse pointer by a forward
  pointer to space after the end of a message, put the new record
  there, followed by the value of the reverse pointer that we
  replace. Thus, there is no one-to-one correspondence between
  forward and backward pointers! In fact, there can be multiple
  forward pointers for one reverse pointer.

When relocating a record we must not relocate the target of a jump
==================================================================

As discussed above, when replacing an existing record, we overwrite
it with a forward pointer to the new information. If the old record
is too small we relocate one or more records that follow the record
that's being replaced, until we have enough space for the forward
pointer record.

Now we have to become really careful. Could we end up relocating a
record that is the target of a forward or reverse pointer, and thus
corrupt the queue file? The answer is NO.

- We never relocate end-of-segment marker records. Instead, the
  cleanup server writes dummy pointer records to guarantee that
  there is always space for a pointer.

- When a record is the target of a forward pointer, it is "edited"
  information that is preceded either by the end-of-queue-file
  marker record, or it is preceded by the reverse pointer at the
  end of earlier written "edited" information. Thus, the target of
  a forward pointer will not be relocated to make space for a pointer
  record.

- When a record is the target of a reverse pointer, it is always
  preceded by a forward pointer record (or by a forward pointer
  record followed by some unused space). Thus, the target of a
  reverse pointer will not be relocated to make space for a pointer
  record.

Could we end up relocating a pointer record? Yes, but that is OK,
as long as pointers contain absolute offsets.

Pointer records introduce the possibility of loops
==================================================

When a queue file is damaged, a bogus pointer value may send Postfix
into a loop. This must not happen.

Detecting loops is not trivial:

- A sequence of multiple forward pointers may be followed by one
  legitimate reverse pointer to the location after the first forward
  pointer. See above for a discussion of how to append a record to
  an appended record.

- We do know, however, that there will not be more reverse pointers
  than forward pointers. But this does not help much.
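
To make the append scheme and the loop hazard concrete, here is a toy
in-memory model. It is a sketch under loud assumptions, not Postfix
code: the queue file is a Python list, a record's "offset" is its list
index, and the names PTR, DATA, END, append_record and walk are all
hypothetical. The walker detects loops by remembering which offsets it
has already visited, which is only one possible guard.

```python
# Toy model (hypothetical, not the Postfix record format): a queue file is
# a list of (type, value) records and a record's offset is its list index.
PTR = "ptr"    # pointer record; value is an absolute index, None for a dummy
DATA = "data"  # any payload record (recipient, header line, ...)
END = "end"    # marker that ends the original records


def append_record(qfile, ptr_index, payload):
    """Append payload at the dummy or reverse pointer found at ptr_index.

    Returns the index of the new reverse pointer, which is the place
    where the next append to the same spot must be made.
    """
    kind, target = qfile[ptr_index]
    if kind != PTR:
        raise ValueError("record at %d is not a pointer" % ptr_index)
    # A dummy pointer falls through to the record that follows it; a
    # reverse pointer hands its target over to the new reverse pointer.
    return_to = ptr_index + 1 if target is None else target
    new_index = len(qfile)               # space after the end of the file
    qfile[ptr_index] = (PTR, new_index)  # now a forward pointer
    qfile.append((DATA, payload))
    qfile.append((PTR, return_to))       # the single reverse pointer
    return new_index + 1


def walk(qfile):
    """Yield payloads in logical order; raise ValueError on a bad pointer."""
    seen = set()
    index = 0
    while True:
        if index in seen or not 0 <= index < len(qfile):
            raise ValueError("corrupt queue file: bad jump to %r" % index)
        seen.add(index)
        kind, value = qfile[index]
        if kind == END:
            return
        if kind == PTR:
            index = index + 1 if value is None else value
            continue
        yield value
        index += 1
```

Appending twice at the same place leaves two forward pointers and one
reverse pointer in the list, yet walk() still visits every record at
most once. Overwriting the reverse pointer with an offset that was
already visited makes walk() raise instead of spinning.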

Perhaps we can include a record count at the start of the queue
file, so that the record walking code knows that it's looking at
some records more than once, and returns an error indication.

How many bytes do we need for a pointer record?
===============================================

A pointer record would look like this:

    type (1 byte)
    offset (see below)

Postfix uses long for queue file size/offset information, and stores
them as %15ld in the SIZE record at the start of the queue file.
This is somewhat less than a 64-bit long, but it is enough for
some time to come, and it is easily changed without breaking forward
or backward compatibility.

It does mean, however, that a pointer record can easily exceed the
length of a header record. This is why we go through the trouble
of record relocation and dummy records.

In Postfix 2.4 we fixed this by adding padding to short message
header records so that we can always write a pointer record over a
message header. This immensely simplifies the code.
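
The size argument above can be checked with a little arithmetic. The
sketch below assumes the two-field layout shown earlier (one type byte
plus a 15-character decimal offset) and ignores any record length field.
The type code b"p" and the helpers encode_pointer, decode_pointer and
padding_needed are hypothetical names chosen for illustration. They do
not describe how Postfix 2.4 actually pads header records.

```python
# Hypothetical model of a pointer record: 1 type byte + "%15ld"-style offset.
PTR_TYPE = b"p"                      # assumed type code, for illustration only
OFFSET_WIDTH = 15                    # field width of the %15ld format
MAX_OFFSET = 10 ** OFFSET_WIDTH - 1  # largest offset that fits in 15 digits


def encode_pointer(offset):
    """Return the type byte followed by a right-aligned decimal offset."""
    if not 0 <= offset <= MAX_OFFSET:
        raise ValueError("offset does not fit in %d digits" % OFFSET_WIDTH)
    return PTR_TYPE + str(offset).rjust(OFFSET_WIDTH).encode("ascii")


def decode_pointer(record):
    """Return the absolute offset stored in a pointer record."""
    if len(record) != 1 + OFFSET_WIDTH or record[:1] != PTR_TYPE:
        raise ValueError("not a pointer record")
    return int(record[1:].decode("ascii"))


def padding_needed(header_text):
    """Bytes to add so the offset field fits over this header text."""
    return max(0, OFFSET_WIDTH - len(header_text))
```

Under these assumptions every pointer record is 16 bytes, so a header
such as "To: a" would need 10 bytes of padding before a pointer could
overwrite it in place, while longer headers need none.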