Name                        Date          Size      #Lines   LOC
..                          03-May-2022   -
Razor2-Preproc-deHTMLxs/    03-May-2022   -         2,008    1,040
bin/                        03-May-2022   -         181      62
docs/                       03-May-2022   -         985      613
lib/Razor2/                 23-May-2007   -         7,005    4,503
BUGS                        03-May-2022   2.1 KiB   94       50
CREDITS                     03-May-2022   462       16       10
Changes                     03-May-2022   17.9 KiB  1,035    589
FAQ                         03-May-2022   3.7 KiB   105      73
INSTALL                     03-May-2022   2.8 KiB   89       60
MANIFEST                    03-May-2022   1.9 KiB   72       65
META.yml                    03-May-2022   613       19       17
Makefile.PL                 03-May-2022   4 KiB     148      93
README                      03-May-2022   5.5 KiB   121      91
SERVICE_POLICY              03-May-2022   374       16       10

README

                    Vipul's Razor v2 README

Vipul's Razor is a distributed, collaborative, spam detection and
filtering network. Through user contribution, Razor establishes a
distributed and constantly updated catalogue of spam in propagation that
is consulted by email clients to filter out known spam. Detection is done
with statistical and randomized signatures that efficiently spot mutating
spam content. User input is validated through reputation assignments based
on consensus on report and revoke assertions, which in turn are used for
computing confidence values associated with individual signatures.

Vipul's Razor v2 agent software is available from the project's homepage
at http://razor.sf.net. Razor Agents are written in Perl and will work on
most Unix operating systems and other OSes for which Perl is available.
Installation and usage instructions can be found in the INSTALL document
in the distribution.

Vipul's Razor v2 is almost a complete rewrite of Razor v1. The following
is a list of the most significant new features:

 1 New Protocol

    The Razor v2 protocol has been completely redesigned. The new
    protocol is based on the exchange of _Structured Information
    Strings_, which are similar to URIs and can be parsed with URI
    decoding libraries. The v2 protocol supports _Pipelining_, which
    means Razor Agents can keep a connection open with the server to
    eliminate the latency introduced by the TCP 3-way handshake and
    4-way teardown for every connection. The new protocol semantics
    allow seamless introduction of new signature schemes.

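    To illustrate what "can be parsed with URI decoding libraries" means
    in practice, here is a minimal Python sketch. The message and its key
    names (action, engine, sig) are invented for this example and are not
    the actual Razor v2 wire format; only the URI-query style of encoding
    is taken from the description above.

```python
from urllib.parse import parse_qsl, urlencode


def parse_sis(line: str) -> dict:
    """Decode a URI-query-style string into a dict of key/value pairs."""
    return dict(parse_qsl(line, keep_blank_values=True))


def build_sis(fields: dict) -> str:
    """Encode key/value pairs back into a URI-query-style string."""
    return urlencode(fields)


# Hypothetical message: the key names are assumptions, not Razor's own.
message = "action=check&engine=4&sig=abc%2Bdef"
fields = parse_sis(message)
print(fields)             # percent-escapes such as %2B are decoded
print(build_sis(fields))  # and re-encoded on the way out
```

    Because the encoding is generic, a new key can be added to a message
    without changing the parser, which fits the claim that new signature
    schemes can be introduced seamlessly.
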
 2 Ephemeral Signatures

    Ephemeral Signatures are short-lived signatures based on
    collaboratively computed random numbers. Ephemeral Signatures select a
    section of text from the spam message based on a random number that
    changes every so often. This makes the hashing scheme a moving target,
    and spammers can't exploit it because they don't know which part of
    the message will be hashed after the random number rollover.

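    The idea of hashing a section chosen by a shared random number can be
    sketched as follows. This is an illustration only: the function name,
    the fixed section length and the use of Python's random module to turn
    the shared number into an offset are all assumptions, not the way
    Razor actually selects text.

```python
import hashlib
import random


def ephemeral_signature(body: bytes, shared_random: int,
                        section_len: int = 64) -> str:
    """Hash a section of `body` whose position depends on `shared_random`.

    Everyone who knows the current shared number picks the same section,
    so their signatures agree; after a rollover the section moves.
    """
    if len(body) <= section_len:
        section = body
    else:
        rng = random.Random(shared_random)  # deterministic for a given seed
        start = rng.randrange(len(body) - section_len + 1)
        section = body[start:start + section_len]
    return hashlib.sha1(section).hexdigest()


body = bytes(range(256)) * 4
print(ephemeral_signature(body, shared_random=12345))
print(ephemeral_signature(body, shared_random=67890))
```
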
 3 Preprocessors

    Razor v2 supports several preprocessors. Preprocessors alter the
    text of a spam before a hash is computed. This version includes
    preprocessors to decode Base64 encoded messages, decode QP encoded
    messages and convert HTML to plaintext. Spammers employ several
    techniques that hide mutations in various encodings. Preprocessors
    defeat such techniques by hashing the content that a recipient
    actually sees in his/her mail user agent.

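    A rough sketch of such a preprocessing step, using only the Python
    standard library, is shown below. The function names and the order of
    the steps are assumptions made for illustration; Razor's own
    preprocessors are written in Perl (plus the deHTMLxs extension) and
    may behave differently.

```python
import base64
import quopri
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(html: str) -> str:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so layout changes do not alter the text.
    return " ".join("".join(parser.chunks).split())


def preprocess(body: bytes, transfer_encoding: str, is_html: bool) -> str:
    """Undo the transfer encoding, then strip HTML markup if present."""
    if transfer_encoding == "base64":
        body = base64.b64decode(body)
    elif transfer_encoding == "quoted-printable":
        body = quopri.decodestring(body)
    text = body.decode("utf-8", errors="replace")
    return html_to_text(text) if is_html else text


raw = b"<p>Buy <b>now</b>!</p>"
print(preprocess(raw, "7bit", True))
print(preprocess(base64.b64encode(raw), "base64", True))
```

    Both calls reduce to the same visible text, so a hash computed
    afterwards is the same regardless of which encoding the spammer chose.
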
 4 Multiple Filtration Engines

    Razor v2 supports multiple engines. An engine is a logical unit that
    encapsulates a particular type of filtration service. Razor v2
    currently supports four engines: VR1, which is equivalent to Razor v1,
    VR2, which is based on SHA1 signatures of body text, VR3, which is
    based on Nilsimsa signatures, and VR4, which is based on Ephemeral
    hashes. New engines can be seamlessly plugged into the service as and
    when required.

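    One way to picture pluggable engines is a registry that maps an
    engine number to a signature function. The sketch below is
    hypothetical: the registry, the decorator and the function names are
    invented for illustration, and only a SHA1-of-body-text engine (in the
    spirit of VR2) is shown.

```python
import hashlib
from typing import Callable, Dict, Iterable

# Hypothetical registry: engine number -> function computing a signature.
ENGINES: Dict[int, Callable[[str], str]] = {}


def register_engine(engine_id: int):
    """Decorator that plugs a signature function into the registry."""
    def decorator(func: Callable[[str], str]) -> Callable[[str], str]:
        ENGINES[engine_id] = func
        return func
    return decorator


@register_engine(2)
def sha1_body_engine(body_text: str) -> str:
    """SHA1 over the body text, similar in spirit to the VR2 engine."""
    return hashlib.sha1(body_text.encode("utf-8")).hexdigest()


def compute_signatures(body_text: str, engine_ids: Iterable[int]) -> dict:
    """Compute one signature per requested engine that is registered."""
    return {eid: ENGINES[eid](body_text)
            for eid in engine_ids if eid in ENGINES}


# Engine 3 is not registered in this sketch, so it is skipped.
print(compute_signatures("hello", [2, 3]))
```
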
 5 Complete Backward Compatibility with Razor v1

    The VR1 engine is functionally equivalent to the Razor v1 service and
    uses the same database. This means users who transition from v1 to v2
    will still get the benefit of several million signatures known to the
    v1 service.

 6 Base64 signature encoding

    Signatures are now encoded as base 64 numbers instead of base 16
    (hex), reducing signature traffic over the wire by about 33%.

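    The 33% figure follows from the encodings themselves: a hex character
    carries 4 bits and a base64 character carries 6, so base64 needs 4/6
    as many characters, a saving of 1/3. The short Python check below
    compares both encodings of a 20-byte SHA1 digest. Standard padded
    base64 is assumed here; the README does not say whether Razor pads.

```python
import base64
import hashlib


def saving(raw_len: int) -> float:
    """Fraction of characters saved by padded base64 compared with hex."""
    hex_len = 2 * raw_len                 # two hex characters per byte
    b64_len = 4 * ((raw_len + 2) // 3)    # four characters per 3-byte group
    return 1 - b64_len / hex_len


digest = hashlib.sha1(b"example spam body").digest()   # 20 raw bytes
hex_sig = digest.hex()
b64_sig = base64.b64encode(digest).decode("ascii")

print(len(hex_sig), len(b64_sig))   # 40 characters versus 28
print(saving(20))                   # a little under 1/3 because of padding
print(saving(30))                   # 1/3 when the length is a multiple of 3
```

    For a single 20-byte digest the padding brings the saving down to
    30%; the one-third figure is reached when the input length is a
    multiple of 3 bytes.
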
 7 Truth Evaluation System (TeS)

    Razor v2 has a transparent, back-end component known as TeS. TeS is a
    combination of a reputation system and pattern recognition heuristics
    that assigns trust to reporters and confidence values (between 0 and
    100) to every signature. Users can set an acceptable confidence level
    in their Razor configuration. The server also publishes a recommended
    confidence level. TeS has been designed to eliminate false positives
    of legit bulk email that were occasionally generated by bad reports
    in Razor v1.

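    On the client side, the confidence level acts as a threshold. The
    sketch below shows one plausible reading: use the user's level if one
    is set, otherwise fall back to the server's recommendation, and treat
    a signature as spam when its confidence reaches the threshold. The
    function names and the "greater than or equal" comparison are
    assumptions; the README does not spell out the exact rule.

```python
from typing import Optional


def effective_threshold(user_level: Optional[int],
                        server_recommended: int) -> int:
    """Prefer the user's configured level, else the server's suggestion."""
    return server_recommended if user_level is None else user_level


def is_spam(signature_confidence: int, threshold: int) -> bool:
    """Flag a signature whose confidence (0 to 100) reaches the threshold."""
    if not 0 <= signature_confidence <= 100:
        raise ValueError("confidence must be between 0 and 100")
    return signature_confidence >= threshold


# A signature with confidence 75, checked against two configurations.
print(is_spam(75, effective_threshold(None, 60)))  # server default applies
print(is_spam(75, effective_threshold(80, 60)))    # stricter user setting
```
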
 8 Submission of entire spam messages

    Razor v2 accepts the entire body text of spam messages not previously
    known to the system. This lets Razor v2 compute new Ephemeral
    Signatures every n hours as well as seed the database whenever a new
    signature scheme and/or preprocessor is introduced. It should be noted
    that Razor v2 _does not_ accept contents of legit email during a check
    dialogue. Only signatures are sent when checking email.

 9 Revocation

    Razor v2 allows users to revoke messages that they don't consider to
    be spam. Revocation input is fed into TeS, which adjusts the
    confidence value of a signature or removes it from the database as
    necessary. Revocation is done through a tool called razor-revoke,
    which is a part of the new Razor distribution.

10 Reporter Registration

    Razor v2 requires reporters to be registered. This lets reporters
    build a reputation over time, so their reports and revocations are
    weighed according to their reputation value. Reporting requires users
    to authenticate, which is done using a CRAM-SHA1 authentication scheme.

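    A challenge-response exchange of this kind can be sketched in Python
    as below. The README does not describe Razor's exact construction, so
    this sketch assumes HMAC-SHA1 over a server-issued challenge, by
    analogy with CRAM-MD5 (which uses HMAC-MD5). The function names and
    the challenge format are hypothetical.

```python
import hashlib
import hmac
import secrets


def make_challenge() -> bytes:
    """Server side: issue a fresh, unpredictable challenge."""
    return secrets.token_hex(16).encode("ascii")


def client_response(secret: bytes, challenge: bytes) -> str:
    """Client side: prove knowledge of the secret without sending it."""
    return hmac.new(secret, challenge, hashlib.sha1).hexdigest()


def server_verify(secret: bytes, challenge: bytes, response: str) -> bool:
    """Server side: recompute the digest and compare in constant time."""
    expected = client_response(secret, challenge)
    return hmac.compare_digest(expected, response)


challenge = make_challenge()
response = client_response(b"reporter-password", challenge)
print(server_verify(b"reporter-password", challenge, response))
print(server_verify(b"wrong-password", challenge, response))
```

    The password itself never crosses the wire, and a captured response
    is useless against a different challenge.
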
11 Content classes

    Razor v2 introduces the concept of content classes. A content class is
    a set of messages that represents variations on the same content. As
    new reports come in, Nomination servers associate them with an
    existing content class, if a (close) match is found. Additionally,
    Razor v2 treats each MIME attachment as a separate content class, so
    spammers' MIME attachments can be individually tracked (which is very
    useful in case of viruses).


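    The association step can be pictured with a small sketch. It uses
    difflib.SequenceMatcher from the Python standard library as a stand-in
    similarity measure and an arbitrary threshold of 0.8; both are
    assumptions chosen for illustration and are not how the Nomination
    servers actually match reports.

```python
from difflib import SequenceMatcher
from typing import List


def assign_content_class(report: str, classes: List[List[str]],
                         threshold: float = 0.8) -> int:
    """Add `report` to the closest class, or open a new class for it.

    Each class is a list of messages; its first member is used as the
    representative when measuring similarity.
    """
    best_index, best_score = None, 0.0
    for index, members in enumerate(classes):
        score = SequenceMatcher(None, report, members[0]).ratio()
        if score > best_score:
            best_index, best_score = index, score
    if best_index is not None and best_score >= threshold:
        classes[best_index].append(report)
        return best_index
    classes.append([report])
    return len(classes) - 1


classes: List[List[str]] = []
first = assign_content_class(
    "Cheap pills, buy now at example.com today", classes)
second = assign_content_class(
    "Cheap pills, buy now at example.net today", classes)
third = assign_content_class(
    "Meeting notes for Tuesday are attached", classes)
print(first, second, third)
```

    The two near-identical reports land in the same class, while the
    unrelated message opens a new one.
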
              $Id: README,v 1.4 2005/06/28 22:19:07 jpr5 Exp $
