===============
Header matching
===============

Mailman can do pattern-based header matching during its normal rule
processing.  There is a set of site-wide default header matches specified in
the configuration file under the ``[antispam]`` section.

    >>> mlist = create_list('test@example.com')

In this section, the variable ``header_checks`` contains a list of the headers
to check, and the patterns to check them against.  By default, this list is
empty.

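As a rough illustration of what a non-empty list might look like, the sketch
below parses a hypothetical ``[antispam]`` fragment with Python's standard
``configparser``.  It assumes that each ``header_checks`` line is a
``header: pattern`` pair; the sample entries and the ``parse_header_checks``
helper are made up for this example, and Mailman reads its real configuration
through its own machinery, not through this code.

```python
import configparser

# Hypothetical configuration fragment; both entries are illustrative only.
SAMPLE = """\
[antispam]
header_checks:
    X-Spam: (yes|maybe)
    X-Spam-Score: [*]{4,}
jump_chain: hold
"""


def parse_header_checks(text):
    """Return (header, pattern) pairs from an [antispam] fragment."""
    # Interpolation is disabled so that patterns are read verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    checks = []
    for line in parser['antispam']['header_checks'].splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first colon only, so a pattern may contain colons.
        header, _, pattern = line.partition(':')
        # Header names are case-insensitive, so normalize to lower case.
        checks.append((header.strip().lower(), pattern.strip()))
    return checks


header_checks = parse_header_checks(SAMPLE)
```

Here ``header_checks`` ends up as a list of two pairs, one per configured
line, which is the same shape of data that the rest of this document passes
around as a header name plus a pattern.
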
It is also possible to programmatically extend these header checks.  Here,
we'll extend the checks with a pattern that matches four or more stars.

    >>> chain = config.chains['header-match']
    >>> chain.extend('x-spam-score', '[*]{4,}')

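To see what the pattern itself accepts, here is a minimal standalone sketch
using only the ``re`` module.  It assumes that the pattern is applied with a
search-style, case-insensitive match anywhere in the header value; the
``header_matches`` helper is hypothetical and is not part of Mailman.

```python
import re

PATTERN = '[*]{4,}'


def header_matches(value, pattern=PATTERN):
    """Return True if `pattern` is found anywhere in `value`."""
    # Assumption: search-style, case-insensitive matching.
    return re.search(pattern, value, re.IGNORECASE) is not None


samples = ('', '***', '****', '*****', 'score: ******')
results = {value: header_matches(value) for value in samples}
```

The character class ``[*]`` stands for a literal asterisk, so ``[*]{4,}`` is
equivalent to ``\*{4,}``: only values containing a run of at least four
stars match.
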
First, if the message has no ``X-Spam-Score:`` header, the message passes
through the chain with no matches.

    >>> msg = message_from_string("""\
    ... From: aperson@example.com
    ... To: test@example.com
    ... Subject: Not spam
    ... Message-ID: <ant>
    ...
    ... This is a message.
    ... """)

.. Function to help with printing rule hits and misses.
    >>> def hits_and_misses(msgdata):
    ...     hits = msgdata.get('rule_hits', [])
    ...     if len(hits) == 0:
    ...         print('No rules hit')
    ...     else:
    ...         print('Rule hits:')
    ...         for rule_name in hits:
    ...             rule = config.rules[rule_name]
    ...             print('    {}: {}'.format(rule.header, rule.pattern))
    ...     misses = msgdata.get('rule_misses', [])
    ...     if len(misses) == 0:
    ...         print('No rules missed')
    ...     else:
    ...         print('Rule misses:')
    ...         for rule_name in misses:
    ...             rule = config.rules[rule_name]
    ...             print('    {}: {}'.format(rule.header, rule.pattern))

By looking at the message metadata after chain processing, we can see that
none of the rules matched.

    >>> from mailman.core.chains import process
    >>> msgdata = {}
    >>> process(mlist, msg, msgdata, 'header-match')
    >>> hits_and_misses(msgdata)
    No rules hit
    Rule misses:
        x-spam-score: [*]{4,}

The header may exist without matching the pattern.

    >>> msg['X-Spam-Score'] = '***'
    >>> msgdata = {}
    >>> process(mlist, msg, msgdata, 'header-match')
    >>> hits_and_misses(msgdata)
    No rules hit
    Rule misses:
        x-spam-score: [*]{4,}

The header may exist and match the pattern.  By default, when the header
matches, the message is held for moderator approval.
::

    >>> from mailman.interfaces.chain import ChainEvent
    >>> from mailman.testing.helpers import event_subscribers
    >>> def handler(event):
    ...     if isinstance(event, ChainEvent):
    ...         print(event.__class__.__name__,
    ...               event.chain.name, event.msg['message-id'])

    >>> del msg['x-spam-score']
    >>> msg['X-Spam-Score'] = '*****'
    >>> msgdata = {}
    >>> with event_subscribers(handler):
    ...     process(mlist, msg, msgdata, 'header-match')
    HoldEvent hold <ant>

    >>> hits_and_misses(msgdata)
    Rule hits:
        x-spam-score: [*]{4,}
    No rules missed

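The hit and miss bookkeeping seen above can be modeled with a few lines of
plain Python.  This is a simplified stand-in for chain processing under
stated assumptions, not Mailman's implementation: the ``check_headers``
helper is hypothetical, rules are reduced to ``(header, pattern)`` pairs
rather than named rule objects, and matching is assumed to be a
case-insensitive search.

```python
import re
from email import message_from_string


def check_headers(msg, rules, msgdata):
    """Record rule hits and misses in `msgdata`; return True on a hit."""
    for header, pattern in rules:
        # A header may appear several times; any matching value is a hit.
        values = msg.get_all(header, [])
        if any(re.search(pattern, v, re.IGNORECASE) for v in values):
            msgdata.setdefault('rule_hits', []).append((header, pattern))
            # A real chain would now jump elsewhere, for example to 'hold'.
            return True
        msgdata.setdefault('rule_misses', []).append((header, pattern))
    return False


msg = message_from_string("""\
From: aperson@example.com
To: test@example.com
Subject: Not spam
Message-ID: <ant>

This is a message.
""")
rules = [('x-spam-score', '[*]{4,}')]

# Without the header, the single rule is recorded as a miss.
no_header = {}
missed = check_headers(msg, rules, no_header)

# With five stars, the same rule is recorded as a hit.
msg['X-Spam-Score'] = '*****'
with_header = {}
matched = check_headers(msg, rules, with_header)
```

In this model a message with no ``X-Spam-Score:`` header produces one miss
and no hits, while the five-star message produces one hit and no misses,
mirroring the two outcomes printed by ``hits_and_misses()``.
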
The configuration file can also specify a different final disposition for
messages that match the header checks.  For example, we may just want to
discard such messages.

    >>> from mailman.testing.helpers import configuration
    >>> msgdata = {}
    >>> with event_subscribers(handler):
    ...     with configuration('antispam', jump_chain='discard'):
    ...         process(mlist, msg, msgdata, 'header-match')
    DiscardEvent discard <ant>

These programmatically added header checks can be removed by flushing the
chain.  Now, nothing will match this message.

    >>> chain.flush()
    >>> msgdata = {}
    >>> process(mlist, msg, msgdata, 'header-match')
    >>> hits_and_misses(msgdata)
    No rules hit
    No rules missed


List-specific header matching
=============================

Each mailing list can also be configured with a set of header matching regular
expression rules.  These can be used to impose list-specific header filtering
with the same semantics as the global ``[antispam]`` section, or to have a
different action.

To follow the global antispam action, the header match rule must not specify a
``chain`` to jump to.  If the default antispam action is changed in the
configuration file and Mailman is restarted, those rules will get the new jump
action.

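The fallback described above can be pictured with a small sketch.  The
``resolve_chain`` function and the ``SITE_DEFAULT_CHAIN`` constant are
hypothetical names used only for illustration; the sketch assumes that a rule
without its own chain is represented by ``None`` and that the site-wide
default is ``hold``.

```python
# Assumed site-wide default, standing in for jump_chain under [antispam].
SITE_DEFAULT_CHAIN = 'hold'


def resolve_chain(rule_chain, site_default=SITE_DEFAULT_CHAIN):
    """Return the name of the chain that a matching rule jumps to."""
    # A rule with no chain of its own follows the site-wide action.
    return site_default if rule_chain is None else rule_chain


follows_default = resolve_chain(None)
own_action = resolve_chain('discard')
after_config_change = resolve_chain(None, site_default='discard')
```

A rule that names its own chain keeps that action regardless of the site
default, whereas a rule without one picks up whatever the default is at the
time it is resolved.
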
The list administrator wants to match not on four stars but on three or more
plus signs, and only for the current mailing list.

    >>> from mailman.interfaces.mailinglist import IHeaderMatchList
    >>> header_matches = IHeaderMatchList(mlist)
    >>> header_matches.append('x-spam-score', '[+]{3,}')

A message with a spam score of two pluses does not match.

    >>> msgdata = {}
    >>> del msg['x-spam-score']
    >>> msg['X-Spam-Score'] = '++'
    >>> process(mlist, msg, msgdata, 'header-match')
    >>> hits_and_misses(msgdata)
    No rules hit
    Rule misses:
        x-spam-score: [+]{3,}

But a message with a spam score of three pluses does match.  Because a message
with the previous ``Message-Id`` is already in the moderation queue, we need
to give this message a new ``Message-Id``.

    >>> msgdata = {}
    >>> del msg['x-spam-score']
    >>> msg['X-Spam-Score'] = '+++'
    >>> del msg['message-id']
    >>> msg['Message-Id'] = '<bee>'
    >>> process(mlist, msg, msgdata, 'header-match')
    >>> hits_and_misses(msgdata)
    Rule hits:
        x-spam-score: [+]{3,}
    No rules missed

As does a message with a spam score of four pluses.

    >>> msgdata = {}
    >>> del msg['x-spam-score']
    >>> msg['X-Spam-Score'] = '++++'
    >>> del msg['message-id']
    >>> msg['Message-Id'] = '<cat>'
    >>> process(mlist, msg, msgdata, 'header-match')
    >>> hits_and_misses(msgdata)
    Rule hits:
        x-spam-score: [+]{3,}
    No rules missed

Now, the list administrator still wants to match on three plus signs, but
wants those emails to be discarded instead of held.

    >>> header_matches.remove('x-spam-score', '[+]{3,}')
    >>> header_matches.append('x-spam-score', '[+]{3,}', 'discard')

A message with a spam score of three pluses will still match, and the message
will be discarded.

    >>> msgdata = {}
    >>> del msg['x-spam-score']
    >>> msg['X-Spam-Score'] = '+++'
    >>> del msg['message-id']
    >>> msg['Message-Id'] = '<dog>'
    >>> with event_subscribers(handler):
    ...     process(mlist, msg, msgdata, 'header-match')
    DiscardEvent discard <dog>
    >>> hits_and_misses(msgdata)
    Rule hits:
        x-spam-score: [+]{3,}
    No rules missed
