junkfilter

a junk e-mail filter system for procmail

Copyright 1997-2002 Gregory Sutter <gsutter@zer0.org>

$Id: README,v 2.25 2002/04/21 03:16:14 gsutter Exp $

Contents:
    0. What is junkfilter?
    1. Use of junkfilter
    2. How to get junkfilter
    3. Mailing lists
    4. Installation of junkfilter
    5. Sample .procmailrc file that calls junkfilter
    6. Helping improve junkfilter
    7. Customizing junkfilter
    8. Contributors to junkfilter

0. What is junkfilter?

    junkfilter is a spam filtering program built on top of the
    procmail email delivery system. The goal is to create filter
    sets that will block as much spam as possible. junkfilter
    works equally well for an individual user or system-wide;
    however, since procmail can be slow and mail volumes high,
    running junkfilter system-wide is generally discouraged.
    junkfilter makes an excellent second-stage spam filter when
    coupled with a first-stage MTA-based ruleset.

    As junkfilter requires procmail, it can only be used on a
    Unix-like system. procmail does not work on Windows; this
    is covered in the procmail FAQ:
        http://www.zer0.org/procmail/mini-faq.html#nt

1. Use of junkfilter
    junkfilter is Copyright 1997-2002 Gregory Sutter.
    All rights reserved.

    junkfilter is licensed under a BSD-style license. See the
    LICENSE file for the full text.

2. How to get junkfilter
    The junkfilter web page is: http://junkfilter.zer0.org/
    junkfilter and this documentation are available at the web site.

    junkfilter is also hosted as a SourceForge project. The URL
    there is:
        http://sourceforge.net/projects/junkfilter/

    junkfilter's CVS tree is available from:
        http://sourceforge.net/cvs/?group_id=13498

3. Mailing lists
    junkfilter has two mailing lists: an announce list and a
    general-purpose list. To receive announcements of new
    releases, send a message to
    junkfilter-announce-subscribe@yahoogroups.com. To join the
    two-way general discussion list as well, send a message to
    junkfilter-users-subscribe@yahoogroups.com. Thanks to
    eGroups.com, now part of Yahoo!, for hosting these lists.

4. Installation of junkfilter
    We assume you've already got procmail installed and running
    properly, as this is explicitly a "junk email filter system for
    procmail". Consult the procmail documentation or the FAQ,
        http://www.zer0.org/procmail/
    if you need help installing procmail.

    Set the $PMDIR variable. It is recommended that you make a
    directory ".procmail" in your home directory, move your existing
    .procmailrc into it, and leave a symlink from $HOME/.procmailrc
    to $HOME/.procmail/procmailrc:
        mkdir -m 755 $HOME/.procmail
        mv -i $HOME/.procmailrc $HOME/.procmail/procmailrc
        ln -s $HOME/.procmail/procmailrc $HOME/.procmailrc
    If you do this, you can set PMDIR=$HOME/.procmail

    Place the junkfilter files wherever you want them.
    $PMDIR/junkfilter is a likely choice. Set $JFDIR to the
    directory in which you placed junkfilter, both in your
    procmailrc (for junkfilter to run) and in your shell
    configuration files (for the Makefile).

    In addition to the junkfilter files and default lists, you
    can make blocklists of your own. To use these user blocklists,
    set $JFUSERDIR to the directory in which you want to keep them.
    If you're installing junkfilter just for yourself, this can be
    the same as $JFDIR. If you share the base junkfilter
    installation with other users on the system and don't want to
    share the blocklists, put your user lists elsewhere, such as
    $PMDIR/lists or $PMDIR.
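
    For the shell side of this setup, a minimal sketch for an
    sh-compatible startup file such as ~/.profile might look like
    the following. The paths are only the suggestions made above,
    and keeping user lists in $PMDIR/lists is an assumption; adjust
    them to your own layout. csh users would use setenv instead.

```shell
# Sketch for ~/.profile (sh, ksh, bash); paths are the suggested defaults.
PMDIR="$HOME/.procmail"
JFDIR="$PMDIR/junkfilter"
# Assumption: private user blocklists kept apart from a shared install.
JFUSERDIR="$PMDIR/lists"
export PMDIR JFDIR JFUSERDIR
```

    The same three assignments, without the export line, should also
    be usable near the top of your procmailrc, before junkfilter is
    included.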

    To update the lists after modifying them, be sure that $JFDIR
    and $JFUSERDIR are set in your current session, and use the
    Makefile to parse the data files and build regular expressions
    from them:
        cd $JFDIR
        make create
        make all
    You will find your $JFDIR populated with the default regexp
    data files, and your $JFUSERDIR populated likewise with your own
    data files. Whenever you modify the data files, run 'make all'
    again.
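
    Because the rebuild depends on both variables being set, it can
    help to wrap it in a small guard. The function below is only a
    sketch; the name jf_rebuild is hypothetical and not part of
    junkfilter.

```shell
# Sketch only: jf_rebuild is a hypothetical wrapper, not part of junkfilter.
# It refuses to run make unless both directories are set and exist.
jf_rebuild() {
    if [ -z "${JFDIR:-}" ] || [ -z "${JFUSERDIR:-}" ]; then
        echo "JFDIR and JFUSERDIR must both be set" >&2
        return 1
    fi
    if [ ! -d "$JFDIR" ] || [ ! -d "$JFUSERDIR" ]; then
        echo "JFDIR or JFUSERDIR is not a directory" >&2
        return 1
    fi
    # The subshell keeps the caller's working directory unchanged.
    (cd "$JFDIR" && make all)
}
```

    After editing a data file you would then run jf_rebuild instead
    of typing the cd and make commands by hand.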

    To begin using junkfilter to filter incoming mail, either follow
    the instructions below to add it to your existing .procmailrc
    file, or just copy the included file procmailrc.sample to your
    $PMDIR.

    To use your existing procmailrc, place the following line in an
    appropriate place in your procmailrc file:
        INCLUDERC=$JFDIR/junkfilter
    This will call junkfilter. All other junkfilter files are called
    from within this first file. After mail finishes passing through
    the included junkfilter system, the message itself will not have
    been changed or filed into any mailbox, but several procmail
    variables may have been set. Depending upon the contents of
    these variables, the message can be filed away to another
    mailbox so you don't have to read it.

    After the INCLUDERC statement, this procmail recipe will filter
    mail depending on whether junkfilter has marked the message as
    spam. This example sends the junkmail to a mailbox called
    "junkmail" in your $MAILDIR directory.

        # Deal with mail that junkfilter has flagged.
        :0
        * JFEXP ?? .
        {
            # Check for whitelisted mail
            :0 f
            * JFSTATUS ?? 1
            | formail -i "X-junkfilter: $JFVERSION" \
                      -i "X-Spammer: $JFEXP"

            # File as spam
            :0 E :
            | formail -i "X-junkfilter: $JFVERSION" \
                      -i "X-Spammer: $JFEXP" >> junkmail
        }

    Here is an example demonstrating how to process mail, yet
    perform the filtering within your e-mail program instead of
    sending spam to a different directory. (Tell your email
    program to check for the presence of an X-Is-Spam: header.)

        # Deal with mail that junkfilter has flagged.
        :0
        * JFEXP ?? .
        {
            # Check for whitelisted mail
            :0 f
            * JFSTATUS ?? 1
            | formail -i "X-junkfilter: $JFVERSION" \
                      -i "X-Spammer: $JFEXP"

            # File as spam
            :0 E f
            | formail -i "X-junkfilter: $JFVERSION" \
                      -i "X-Spammer: $JFEXP" \
                      -i "X-Is-Spam: YES"
        }

    In addition to these examples, you can change the action recipe
    to whatever you prefer. The most common change will be the name
    of the mailbox in which the junk mail is stored. You can change
    it to /dev/null if you wish, but remember that no matter how good
    the filter, mistakes will be made. The author does NOT recommend
    immediately discarding any mail filtered by junkfilter.
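
    If you keep the junk mailbox rather than discarding it, a quick
    summary makes it easier to spot mistakes. The sketch below
    assumes an mbox file whose messages carry the X-Spammer: header
    added by the first example recipe; jf_summary is a hypothetical
    helper name, not something shipped with junkfilter.

```shell
# Sketch: count how often each X-Spammer: reason appears in a mailbox.
# jf_summary is a hypothetical helper; $1 is the path to an mbox file.
# Note: a matching line inside a message body would be counted too.
jf_summary() {
    grep '^X-Spammer:' "$1" | sort | uniq -c | sort -rn
}
```

    For example, jf_summary /path/to/junkmail lists the most
    frequent reasons first, so an unexpected rule that suddenly
    catches a lot of mail stands out.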

    You now have a basic junkfilter setup. Next, configure
    junkfilter to suit your needs. Edit junkfilter.config and
    change the various options from 0 to 1 and vice versa. 0 means
    "false"; 1 means "true". A given piece of code will only
    execute if its option is set to true. Please read the comments
    at the beginning of each option before changing anything.

    If you are installing junkfilter as a systemwide solution and
    want each user to have customizable defaults, you can copy the
    junkfilter.config file to each user's home directory, calling it
    ".junkfilterrc". junkfilter will check $HOME/.junkfilterrc for
    local configuration overrides each time it is called.
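
    Copying the configuration for a user can be scripted. The
    following is a sketch under the assumption that junkfilter.config
    lives in the junkfilter directory; jf_user_config is a
    hypothetical helper name. It never overwrites an existing
    .junkfilterrc.

```shell
# Sketch: give one user a private copy of the configuration.
# jf_user_config is a hypothetical helper, not part of junkfilter.
#   $1 = directory containing junkfilter.config
#   $2 = the user's home directory
jf_user_config() {
    jf_src_dir=$1
    jf_user_home=$2
    if [ -e "$jf_user_home/.junkfilterrc" ]; then
        echo "$jf_user_home/.junkfilterrc already exists; not touched" >&2
        return 1
    fi
    cp "$jf_src_dir/junkfilter.config" "$jf_user_home/.junkfilterrc"
}
```

    When run by an administrator, the copied file will probably
    also need its ownership changed to the user in question.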

    You can change the default action of junkfilter to whatever you
    prefer. The only action junkfilter itself takes is to set the
    variable JFEXP to a relevant piece of text. It is up to you to
    then take some action. Since you've called junkfilter from your
    .procmailrc file, you can easily take action depending on the
    output (in the JFEXP variable) of junkfilter.

    The whitelist feature is a way of making sure that certain
    senders or messages are not blocked, even if junkfilter would
    ordinarily block them. The implementation of the whitelist does
    not break compatibility with older releases of junkfilter, but
    does require that a more complex set of recipes be used to
    decide whether or not to take action on the message.

5. Sample .procmailrc file that calls junkfilter
    Please see the file procmailrc.sample for a working example
    of how to call junkfilter from your procmailrc. If you have
    no other procmail recipes, you can simply install this in
    $PMDIR and make a symbolic link to it from your $HOME.

6. Helping improve junkfilter
    If you know procmail, or would like an example of a working
    procmail-based tool of medium complexity to play with or
    hack on, take a look at the junkfilter code. There are
    lots of ways to improve the system. Please submit bugs
    (and preferably patches) to the trackers at SourceForge:
        bugs:
            http://sourceforge.net/tracker/?atid=113498&group_id=13498&func=browse
        patches:
            http://sourceforge.net/tracker/?atid=313498&group_id=13498&func=browse

    Bugs, patches, questions, and comments may also be posted
    to the junkfilter-users mailing list or sent to the author.
    Note that emailing the author directly has the lowest
    probability of receiving a timely response.

    junkfilter users who wish to see more of their spam caught by
    the filter in the future may forward spam that was _not_
    caught by junkfilter to an email address set up for this
    purpose. To do this, enable JF_OPT_SENDBACK in your
    junkfilter.config file (it's near the end). This turns on
    settings that mark each email that passes through junkfilter.
    If a spam has the headers added by this setting, it will be
    accepted at the email address
    <junkfilter-misses@lists.sourceforge.net>. Spam sent here may
    be analyzed and used to improve junkfilter.

    Only spam that has been processed by junkfilter, yet not
    caught, will be of use, and only if JF_OPT_SENDBACK was
    enabled when junkfilter processed it.

    An easy way to test junkfilter when modifying the code is to
    put a sample e-mail in a file such as 'test-mail', then invoke
    procmail directly with:
        procmail < test-mail > test-output 2>&1
    If you're using a csh variant, the command line is:
        procmail < test-mail >& test-output

    You can then look at the test-output file to see how procmail
    handled the test e-mail. The sample e-mail you put in
    'test-mail' should be the raw source of the message, including
    all headers.
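
    As an illustration of what "raw source" means here, the sketch
    below writes a minimal message to 'test-mail': an mbox-style
    "From " line, a few headers, a blank line, and a body. The
    addresses and text are made up and use reserved example domains;
    a real test is more useful with an actual spam message.

```shell
# Sketch: write a minimal raw message (headers, blank line, body).
# All addresses and content here are invented placeholders.
cat > test-mail <<'EOF'
From sender@example.com Sun Apr 21 03:16:14 2002
From: sender@example.com
To: you@example.org
Subject: junkfilter test message

This is the body of the test message.
EOF
```

    The file can then be fed to procmail with the command lines
    given above.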

7. Customizing junkfilter
    junkfilter can be customized in three ways: through the
    junkfilter.user file, through the individual section user
    files, or by modifying the procmail code directly.

    The file junkfilter.user is provided as a convenient place for
    you to store your own personal junk filtration recipes. If you
    follow the recommended format (given at the beginning of that
    file), junkfilter will treat your recipes the same as the rest
    of the files. The "user" section is the first section checked
    when junkfilter is called. In the distribution, the stock
    junkfilter.user is called junkfilter.user-default so that your
    personalized copy is not overwritten when you upgrade later.

    The user files for each individual section (domains, bodychk,
    etc.) are made up of lists of regular expressions. Each file
    has the same name as the corresponding built-in section,
    suffixed with "-user", as in 'bodychk-user'. Add each entry on
    its own line. To compile your lists into the format that
    junkfilter can use, run the 'jf' utility with the argument
    'build' and the name of the section you're building:
        jf build bodychk-user
        jf build domains-user
    The shortcut section name "all-user" will build all of the
    user-configurable data files:
        jf build all-user
    When you use the 'jf' utility in this manner, it will take
    your raw data files and build files with names like
    'jf-bodychk-user' or 'jf-ip-user'. Do not edit these built
    files directly, or your changes will be lost when you next
    use 'jf' to rebuild them.
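
    Since each entry sits on its own line, adding one from the shell
    is straightforward. The sketch below appends a regular
    expression to a user list only if that exact line is not already
    present; jf_add_entry is a hypothetical helper, and the location
    of the list file is whatever you chose for your user lists.

```shell
# Sketch: add one entry to a user list, skipping exact duplicates.
# jf_add_entry is a hypothetical helper, not part of junkfilter.
#   $1 = path to a user list such as bodychk-user
#   $2 = the regular expression to add
jf_add_entry() {
    jf_list=$1
    jf_entry=$2
    touch "$jf_list" || return 1
    # -F fixed string, -x whole-line match, -q no output
    if grep -Fxq -e "$jf_entry" "$jf_list"; then
        return 0
    fi
    printf '%s\n' "$jf_entry" >> "$jf_list"
}
```

    Remember that the raw list still has to be rebuilt with
    'jf build' before junkfilter sees the new entry.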

    If you enable any of the user- rules or options in junkfilter,
    you MUST be sure that the files they reference in
    junkfilter.config exist! This means that you must rename the
    files distributed as *-default, removing the dash and the word
    "default". If you don't do this, the most likely result is
    that all your mail will be classified as junk.
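
    A short loop can take care of the *-default files. This sketch
    copies instead of renaming, which is a deliberate deviation so
    that the pristine default stays available, and it never
    overwrites a file you have already customized.
    jf_install_defaults is a hypothetical helper name.

```shell
# Sketch: create each missing file from its "-default" counterpart.
# jf_install_defaults is a hypothetical helper, not part of junkfilter.
#   $1 = directory holding the *-default files
jf_install_defaults() {
    jf_dir=$1
    for jf_src in "$jf_dir"/*-default; do
        [ -e "$jf_src" ] || continue      # glob matched nothing
        jf_dst=${jf_src%-default}         # strip the "-default" suffix
        [ -e "$jf_dst" ] || cp "$jf_src" "$jf_dst"
    done
}
```

    For example, junkfilter.user-default would be copied to
    junkfilter.user unless junkfilter.user already exists.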

8. Contributors to junkfilter
    Many people have contributed to junkfilter in various ways; the
    author would like to thank the following people in particular:

    Matthew Hunt <mph@pobox.com>, who co-developed junkfilter for the
        first few months. Thanks, Matt!
    Jeff A. Earickson <jaearick@colby.edu>
    Era Eriksson <era@iki.fi>
    Brian Goetz <brian@quiotix.com>
    Philip Guenther <guenther@gac.edu>
    Brad Knowles <brad@his.com>
    Bryan D. McMeen <bryan.mcmeen@symtecinc.com>
    John Perry <perry@jpunix.com>
    Edward Sabol <sabol@alderaan.gsfc.nasa.gov>
    David Tamkin <dattier@wwa.com>
    John Wilkes <john@wilkes.com>
    and
    the procmail mailing list <procmail@Informatik.RWTH-Aachen.DE>


- junkfilter.lists is a general-purpose abstraction for processing
  lists of suspect patterns in message headers. Its intended
  specification is along these lines:
    - calling params:
        - JFSECTION
          names a particular type of "suspect" list, e.g. the string "ip" for
          suspect IP addresses; used to retrieve the relevant file, in this case
          jf-ip from the JFDIR directory. JFLIST is set by reading in the
          contents of $JFDIR/jf-$JFSECTION

        - JFVER
          a "left context" for a suspect pattern, i.e., in order to be
          considered a match, the suspect pattern must be immediately preceded
          by the JFVER pattern

        - JFVERR
          a "right context" for a suspect pattern, i.e., in order to be
          considered a match, the suspect pattern must be immediately followed
          by the JFVERR pattern

    - output params:
        - JFMATCH
          set as usual for all junkfilter matches; contents will be
          appended to JFEXP by the standard processing of junkfilter.match
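
  The left and right context idea can be illustrated outside procmail.
  The sketch below uses grep -E rather than junkfilter's own recipes,
  so it only demonstrates the concept; jf_context_match and the sample
  patterns are hypothetical.

```shell
# Sketch: a suspect pattern counts only when it sits directly between
# the left context (JFVER) and the right context (JFVERR).
# jf_context_match is a hypothetical helper; it reads text on stdin.
#   $1 = left context, $2 = suspect pattern, $3 = right context
jf_context_match() {
    jf_left=$1
    jf_pattern=$2
    jf_right=$3
    grep -Eq -e "${jf_left}(${jf_pattern})${jf_right}"
}
```

  With a left context of '^Received:.*\[' and a right context of '\]',
  an address pattern such as '192\.0\.2\.[0-9]+' would match inside a
  bracketed Received: header but not where the same address merely
  appears in a Subject: line.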