README

                        Overview of innd Internals

Introduction

    innd is in many respects the heart of INN.  It is the transit
    component of the news server, the component that accepts new articles
    from peers or from nnrpd on behalf of local readers, stores them, and
    puts information about them in the right places so that other programs
    such as innxmit or innfeed can send them back to other peers.

    innd is structured around channels.  With the exception of the active
    file, the history database, the article and overview storage system,
    and a few other things such as logs, everything coming into or going
    out of innd is handled by a channel.  Each channel can be waiting to
    read, waiting to write, or sleeping.  innd's main loop (in
    CHANreadloop) calls select, passes control to each channel whose file
    descriptor selected ready for reading or writing, and takes care of
    other housekeeping (such as finding idle peers or waking up sleeping
    channels at the right time).  The core channel routines are in chan.c,
    with major classes of channels handled by cc.c, lc.c, nc.c, rc.c, and
    site.c.  See below for more details on the types of channels.  The
    routines in proc.c are used to manage processes spawned for outgoing
    channels.
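
    As a rough illustration of one pass of such a loop, the sketch below
    calls select on a list of descriptors and hands control to a reader
    callback for each one that selected ready for reading.  It is not
    innd's code: sketch_select_once, sketch_reader, and sketch_drain are
    hypothetical names, and writing, sleeping, and housekeeping are left
    out.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical reader callback type; not innd's actual declaration. */
typedef void (*sketch_reader)(int fd, void *arg);

/* Run one select pass over fds[0..n-1] and call reader for each
 * descriptor that is ready for reading.  Returns the number of
 * descriptors dispatched, or -1 if select fails. */
static int
sketch_select_once(const int *fds, size_t n, sketch_reader reader, void *arg)
{
    fd_set rset;
    struct timeval tv = { 0, 0 };   /* poll only; a real loop would block */
    int maxfd = -1, count, dispatched = 0;
    size_t i;

    assert(fds != NULL && reader != NULL);
    FD_ZERO(&rset);
    for (i = 0; i < n; i++) {
        assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
        FD_SET(fds[i], &rset);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    count = select(maxfd + 1, &rset, NULL, NULL, &tv);
    if (count < 0)
        return -1;
    for (i = 0; i < n; i++) {
        if (FD_ISSET(fds[i], &rset)) {
            reader(fds[i], arg);
            dispatched++;
        }
    }
    return dispatched;
}

/* Demo reader: read up to 256 bytes and add the byte count to the
 * size_t that arg points to. */
static void
sketch_drain(int fd, void *arg)
{
    char buf[256];
    ssize_t got = read(fd, buf, sizeof(buf));

    if (got > 0)
        *(size_t *) arg += (size_t) got;
}
```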

    The storage and overview subsystems are mostly self-contained at this
    point and INN is simply a client of the storage and overview APIs.
    The history database is approaching that state, but some aspects (such
    as the pre-commit cache handled by the WIP* family of routines in
    wip.c) are still handled internally by innd.

    Updates and queries of the active file are handled internally by innd
    in the ICD* and NG* family of routines in icd.c and ng.c.

    innd is configured primarily by incoming.conf (which controls who can
    send articles) and newsfeeds (which controls where the articles should
    go after they're received and stored).  The former is read in rc.c,
    the file that also contains the RC* family of routines for dealing
    with the remote connection channel (see below).  The latter is read by
    newsfeeds.c and is used to set up all of the outgoing channels when
    innd is started or told to re-read the file.  Incoming articles are
    parsed and fed to the appropriate places by the routines in art.c.

    Both Perl and Python embedded filters are supported.  The glue
    routines to load and run the Perl or Python scripts are in perl.c and
    python.c respectively.

    Finally, keywords.c contains the support for synthesizing keywords
    based on article contents, status.c writes out innd status
    periodically if configured, util.c contains various utility functions
    used by other parts of innd, and innd.c contains the startup,
    initialization, and shutdown code as well as the main routine.

Core Channel Handling

    CHANreadloop is the main processing loop of innd.  As long as innd is
    running, it will be inside that function.  The core channel code
    maintains a table of channels, which have a one-to-one correspondence
    with open file descriptors, and three file descriptor sets.  Each
    channel is generally in one of the three sets (reading, writing, or
    sleeping) at any given time.  The states should generally be
    considered mutually exclusive, since NNTP is not asynchronous and a
    channel that's reading and writing at the same time is liable to
    deadlock, but the core code doesn't assume that.
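
    A minimal sketch of the three descriptor sets follows, assuming they
    are plain fd_set values.  The names (sketch_sets and its helpers) are
    hypothetical and do not match innd's own declarations.  Adding a
    descriptor to one set leaves the others untouched, which mirrors the
    point that the core code does not enforce mutual exclusion.

```c
#include <assert.h>
#include <sys/select.h>

/* Hypothetical names; innd's real channel table is declared elsewhere. */
enum sketch_set { SK_READING, SK_WRITING, SK_SLEEPING };

struct sketch_sets {
    fd_set set[3];   /* indexed by enum sketch_set */
    int    max_fd;   /* highest descriptor ever added, -1 if none */
};

static void
sketch_sets_init(struct sketch_sets *s)
{
    int i;

    for (i = 0; i < 3; i++)
        FD_ZERO(&s->set[i]);
    s->max_fd = -1;
}

/* Add fd to one set.  Membership in the other sets is left alone, so
 * a descriptor may sit in more than one set or in none. */
static void
sketch_sets_add(struct sketch_sets *s, int fd, enum sketch_set which)
{
    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_SET(fd, &s->set[which]);
    if (fd > s->max_fd)
        s->max_fd = fd;
}

/* Remove fd from one set only. */
static void
sketch_sets_remove(struct sketch_sets *s, int fd, enum sketch_set which)
{
    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_CLR(fd, &s->set[which]);
}

/* Return 1 if fd is in the given set, 0 otherwise. */
static int
sketch_sets_has(struct sketch_sets *s, int fd, enum sketch_set which)
{
    assert(fd >= 0 && fd < FD_SETSIZE);
    return FD_ISSET(fd, &s->set[which]) ? 1 : 0;
}
```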

    A channel fundamentally consists of two functions, a reader function
    called whenever data is available for it to read and a write-done
    function called when data it wrote has been completely written out.
    If it is put to sleep, it also needs a function that is called when it
    is woken up again.  Some channels may only read (such as the channels
    that accept connections) and some channels may only write (such as
    outgoing feeds), or channels may do both (like NNTP channels).
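
    The idea of a channel as a small set of callbacks can be sketched as
    a struct of function pointers.  The struct, the event enum, and
    sketch_dispatch are hypothetical and much simpler than the channel
    struct in innd.h.  A NULL pointer stands for a role the channel does
    not play, such as a listening channel that never writes.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_chan;
typedef void (*sketch_cb)(struct sketch_chan *);

/* Hypothetical channel: one callback per role, NULL if unused. */
struct sketch_chan {
    int       fd;
    sketch_cb reader;       /* data is available to read */
    sketch_cb write_done;   /* queued output was fully written */
    sketch_cb waker;        /* channel was woken from sleep */
    int       events;       /* counter used only by the demo callback */
};

enum sketch_event { SK_EV_READABLE, SK_EV_WRITTEN, SK_EV_WOKEN };

/* Call the callback that matches ev.  Returns 1 if a callback ran and
 * 0 if the channel does not handle that kind of event. */
static int
sketch_dispatch(struct sketch_chan *cp, enum sketch_event ev)
{
    sketch_cb cb = NULL;

    assert(cp != NULL);
    switch (ev) {
    case SK_EV_READABLE: cb = cp->reader;     break;
    case SK_EV_WRITTEN:  cb = cp->write_done; break;
    case SK_EV_WOKEN:    cb = cp->waker;      break;
    }
    if (cb == NULL)
        return 0;
    cb(cp);
    return 1;
}

/* Demo callback: just count how often it was invoked. */
static void
sketch_count(struct sketch_chan *cp)
{
    cp->events++;
}
```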

    Reading is handled by the channel itself, since some channels don't
    just read data from their file descriptor, but CHANreadtext is
    provided for channels to call from their reader functions if they want
    to read normally.  CHANreadtext puts the data into the channel's input
    buffer and handles resizing and compacting the buffer as needed.  To
    register as a reading channel, the channel calls RCHANadd, and then
    its file descriptor will be added to the read set and its reader
    function will be called whenever select indicates data is available.
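
    The kind of buffer management described here can be sketched as
    follows.  This is not CHANreadtext itself: sketch_buf and
    sketch_readtext are hypothetical, and the growth policy (start at 64
    bytes, then double) is an assumption made for the example.  Before
    each read the unconsumed data is moved to the front of the buffer,
    and the buffer is enlarged only if it is still full.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical input buffer; field names are not innd's. */
struct sketch_buf {
    char  *data;
    size_t size;    /* allocated bytes */
    size_t start;   /* offset of the first unconsumed byte */
    size_t len;     /* unconsumed bytes beginning at start */
};

/* Read once from fd into b, compacting and growing b as needed.
 * Returns what read returned, or -1 if memory could not be obtained. */
static ssize_t
sketch_readtext(int fd, struct sketch_buf *b)
{
    ssize_t n;

    assert(b != NULL);

    /* Compact: slide unconsumed data to the front of the buffer. */
    if (b->start > 0) {
        memmove(b->data, b->data + b->start, b->len);
        b->start = 0;
    }

    /* Resize: grow only when no free space is left after compacting. */
    if (b->len == b->size) {
        size_t newsize = b->size ? b->size * 2 : 64;
        char *p = realloc(b->data, newsize);

        if (p == NULL)
            return -1;
        b->data = p;
        b->size = newsize;
    }

    n = read(fd, b->data + b->len, b->size - b->len);
    if (n > 0)
        b->len += (size_t) n;
    return n;
}
```

    A caller that has parsed some bytes advances start and reduces len;
    the next call then reclaims that space before reading more.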

    Writing is handled by the channel core code; the channel just puts
    data into its output buffer, usually using WCHANset or WCHANappend,
    and then calls WCHANadd to tell the channel code that data is
    available.  The data is written out as select indicates the file
    descriptor can take it, and when the write is complete, the channel's
    write-done function is called.
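
    The write path can be sketched in the same spirit.  The functions
    below are hypothetical stand-ins, not WCHANappend or the real core
    code: sketch_append queues data, and sketch_write_ready is what a
    main loop might call once select reports the descriptor writable.  A
    counter stands in for invoking the channel's write-done function.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical output buffer; field names are not innd's. */
struct sketch_out {
    char  *data;
    size_t len;          /* bytes queued */
    size_t sent;         /* bytes of the queue already written */
    int    done_calls;   /* stands in for the write-done callback */
};

/* Queue n bytes for later writing.  Returns 0, or -1 on allocation
 * failure. */
static int
sketch_append(struct sketch_out *o, const char *p, size_t n)
{
    char *q;

    assert(o != NULL && p != NULL && n > 0);
    q = realloc(o->data, o->len + n);
    if (q == NULL)
        return -1;
    memcpy(q + o->len, p, n);
    o->data = q;
    o->len += n;
    return 0;
}

/* Write as much queued data as fd accepts in one call.  Returns 1 when
 * the queue has drained (and counts a write-done), 0 when nothing was
 * queued or more remains, and -1 on a write error. */
static int
sketch_write_ready(int fd, struct sketch_out *o)
{
    ssize_t n;

    assert(o != NULL);
    if (o->len == 0)
        return 0;
    n = write(fd, o->data + o->sent, o->len - o->sent);
    if (n < 0)
        return -1;
    o->sent += (size_t) n;
    if (o->sent < o->len)
        return 0;
    o->len = 0;
    o->sent = 0;
    o->done_calls++;
    return 1;
}
```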

    Channels are put to sleep if there's some reason why they must not be
    allowed to do anything for some time.  Sleeping is generally used for
    write channels that have encountered some (hopefully temporary) error
    when writing, or which need to pause and spool output for a while
    before writing it out.  Sleeping is also used for NNTP channels when
    the server is paused.  A sleeping channel has an associated time to
    wake up, an optional event that will wake it up earlier, and a
    function that's called when it's woken up.  Sleeping is not used for
    writing channels that just don't have any data at the moment to write;
    those channels are just in none of the three states (which is also
    allowed).
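
    A sleeping channel's bookkeeping might look like the sketch below.
    All names are hypothetical, and representing the optional event as an
    opaque pointer compared by address is an assumption made for the
    example.  Times are passed in explicitly so the logic can be followed
    without a clock.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical sleep state for one channel. */
struct sketch_sleeper {
    int         asleep;
    time_t      wake_at;   /* time at which to wake up */
    const void *event;     /* optional early-wake key, NULL if none */
    int         wakeups;   /* stands in for calling the wake function */
};

/* Put the channel to sleep for the given number of seconds. */
static void
sketch_sleep(struct sketch_sleeper *sp, time_t now, time_t seconds,
             const void *event)
{
    assert(sp != NULL && seconds >= 0);
    sp->asleep = 1;
    sp->wake_at = now + seconds;
    sp->event = event;
}

/* Clear the sleep state and record one wake-up. */
static void
sketch_wake(struct sketch_sleeper *sp)
{
    sp->asleep = 0;
    sp->event = NULL;
    sp->wakeups++;
}

/* Wake the channel if its time has come.  Returns 1 if it was woken. */
static int
sketch_wake_if_due(struct sketch_sleeper *sp, time_t now)
{
    assert(sp != NULL);
    if (!sp->asleep || now < sp->wake_at)
        return 0;
    sketch_wake(sp);
    return 1;
}

/* Wake the channel early if it is waiting on this event. */
static int
sketch_wake_on_event(struct sketch_sleeper *sp, const void *event)
{
    assert(sp != NULL);
    if (!sp->asleep || event == NULL || sp->event != event)
        return 0;
    sketch_wake(sp);
    return 1;
}
```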

    The core channel code also supports prioritized channels.  Normally,
    after each call to select returns, CHANreadloop walks through each
    channel in turn, doing the appropriate work if the channel selected
    for reading or writing or if it is time to wake it up.  However, on
    each pass, the prioritized channels are checked first to see if they
    selected for read, and if so, those reader functions are called
    immediately and the number of other events that will be handled that
    time through is capped (in case more data is available from the
    prioritized channels immediately).  Only the control channel and the
    remote connection channels are prioritized.
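
    The effect of prioritization on a single pass can be sketched like
    this.  The structure, the function name, and the way the cap is
    passed in are all hypothetical; the real loop tracks readiness
    through select's descriptor sets instead of a flag.  Handled channel
    ids are recorded in order so the scheduling is visible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-channel state for one pass. */
struct sketch_pchan {
    int id;
    int prioritized;   /* nonzero for a prioritized channel */
    int ready;         /* nonzero if it selected ready this pass */
};

/* Handle one pass.  Ready prioritized channels go first; if any were
 * handled, at most cap other channels are handled in this pass.  The
 * ids handled are stored in order[], which must hold n entries.
 * Returns the number of channels handled. */
static size_t
sketch_priority_pass(struct sketch_pchan *tab, size_t n, size_t cap,
                     int *order)
{
    size_t i, handled = 0, others = 0;
    int saw_priority = 0;

    assert(tab != NULL && order != NULL);
    for (i = 0; i < n; i++) {
        if (tab[i].prioritized && tab[i].ready) {
            order[handled++] = tab[i].id;
            tab[i].ready = 0;
            saw_priority = 1;
        }
    }
    for (i = 0; i < n; i++) {
        if (!tab[i].ready)
            continue;
        if (saw_priority && others >= cap)
            break;   /* leave the rest for the next pass */
        order[handled++] = tab[i].id;
        tab[i].ready = 0;
        others++;
    }
    return handled;
}
```

    Channels left unhandled because of the cap stay ready and are picked
    up on a later pass.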

Channel Types

    The following channel types are implemented in innd:

    Remote connections (CTremconn)

        This is the channel that accepts new connections from remote
        peers.  If innd is running in the mode where it accepts and hands
        off reader connections to nnrpd, the remconn channel also does
        this.  Its reader function doesn't actually read data, but rather
        accepts the connection and creates a new NNTP channel.  These
        channels are always prioritized.  The implementation is in rc.c.

    NNTP (CTnntp)

        Channels that speak NNTP to a peer (or to nnrpd or rnews feeding
        articles to innd).  These channels are responsible for most of the
        data stored in the channel struct.  They are probably the most
        complex channels in innd and use all of the facilities of the
        channel code.  The implementation is in nc.c, including all the
        code to handle NNTP commands.

    Reject (CTreject)

        A special type of channel that exists solely to reject an unwanted
        connection.  Peers who connect while the server is overloaded, who
        try to open too many connections at once, or who have no access
        (when innd is not handing connections to nnrpd) are handed off to
        this type of channel.  Such a channel just writes the rejection
        message and then closes itself.

    Local connections (CTlocalconn)

        innd maintains a separate local Unix domain socket for the use of
        nnrpd and rnews when injecting articles.  This channel type
        handles incoming connections on that socket and spawns an NNTP
        channel for them, similar to the remote connections channel.
        These channels are not prioritized (but possibly should be).  The
        implementation is in lc.c.

    Control (CTcontrol)

        innd can be given a wide variety of commands by external
        processes, either automated ones like control message handling or
        nightly expiration and log rotation or manual actions by the news
        administrator.  The control channel handles incoming requests on
        the Unix domain socket created for this purpose, runs the command,
        and returns the results.  This Unix domain socket is a datagram
        socket rather than a stream socket, so each command and response
        are single datagrams, making the reader function a bit different
        than other channels.  While the control channel writes its
        response back, it doesn't use the write support in the core
        channel code since it has to send a datagram; instead, it sends
        the response immediately from the reader function.  There is only
        one control channel and it is always prioritized.  The
        implementation is in cc.c.
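
        The one-datagram-in, one-datagram-out pattern can be sketched as
        below.  This is a simplification under stated assumptions: it
        uses a connected datagram socket such as one end of a socketpair,
        the function name is hypothetical, and the "done: " reply format
        is invented for the example and is not innd's control protocol.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical reader for a datagram control socket: receive exactly
 * one command, act on it, and send the reply right away instead of
 * queueing it in an output buffer.  Returns the number of bytes sent,
 * or -1 on error. */
static ssize_t
sketch_control_reader(int fd)
{
    char cmd[512], reply[600];
    ssize_t n;
    int len;

    assert(fd >= 0);
    n = recv(fd, cmd, sizeof(cmd) - 1, 0);   /* one whole datagram */
    if (n < 0)
        return -1;
    cmd[n] = '\0';

    /* "Running" the command is reduced to echoing it back. */
    len = snprintf(reply, sizeof(reply), "done: %s", cmd);
    if (len < 0 || (size_t) len >= sizeof(reply))
        return -1;
    return send(fd, reply, (size_t) len, 0);
}
```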

    File (CTfile)
    Exploder (CTexploder)
    Process (CTprocess)

        These channels are used to implement different types of outgoing
        sites (outgoing channels configured in newsfeeds).  They are
        created as needed by the site code in site.c and get data mostly
        due to the processing of articles by art.c.  These channels are
        mostly alike from the perspective of the channel code, but have
        different types so that the site code can easily distinguish
        between them.

    In addition, the channel type CTany is used as a wildcard in some
    channel operations and the type CTfree is used in the channel table
    for free channels (corresponding to closed file descriptors).
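
    How a wildcard type and a free type might interact when walking the
    channel table is sketched below.  The type names are the ones listed
    above, but their order in the enum, the table layout, and sketch_next
    are assumptions made for illustration.  In particular, having CTany
    skip free slots is a choice of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Type names as listed in the text; order and values are assumed. */
enum sketch_type {
    CTany, CTfree, CTremconn, CTreject, CTnntp, CTlocalconn,
    CTcontrol, CTfile, CTexploder, CTprocess
};

/* Hypothetical table slot holding only the channel type. */
struct sketch_slot {
    enum sketch_type type;
};

/* Return the index of the next slot at or after *pos whose type
 * matches, and advance *pos past it.  CTany matches every slot that is
 * in use; free slots never match.  Returns -1 when no slot is left. */
static int
sketch_next(const struct sketch_slot *tab, int n, int *pos,
            enum sketch_type type)
{
    assert(tab != NULL && pos != NULL && *pos >= 0);
    while (*pos < n) {
        int i = (*pos)++;

        if (tab[i].type == CTfree)
            continue;
        if (type == CTany || tab[i].type == type)
            return i;
    }
    return -1;
}
```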

Article Handling
Newsfeeds and Sites
The Active File

    To be written.