1This is a rough and informal list of suggested improvements to INN, parts
2of INN that need work, and other tasks yet undone.  Some of these may be
3in progress, in which case the person working on them will be noted in
4square brackets and should be contacted if you want to help.  Otherwise,
5let <inn-workers@lists.isc.org> know if you'd like to work on any item listed
6below.
7
8This list is currently being migrated to INN's issue tracking system
9<https://inn.eyrie.org/trac/>.
10
11The list is divided into changes already tentatively scheduled for a
12particular release, higher priority changes that will hopefully be done in
13the near future, small or medium-scale projects for the future, and
14long-term, large-scale problems.  Note that just because a particular
15feature is scheduled for a later release doesn't mean it can't be
16completed earlier if someone decides to take it on.  The association of
17features with releases is intended to be a rough guide for prioritization
18and a set of milestones to use to judge when a new major release is
19justified.
20
21Also, one major thing that is *always* welcome is additions to the test
22suite, which is currently very minimal.  Any work done on the test suite
23to allow more portions of INN to be automatically tested will make all
24changes easier and will be *greatly* appreciated.
25
26Last modified $Id: TODO 9949 2015-09-12 13:48:45Z iulius $.
27
28
29Scheduled for INN 2.7
30
31* Rework and clean up the overview API.  The major change is that the
32  initialization function should return a pointer to an opaque struct
33  which stores all of the state of the overview subsystem, rather than all
34  of that being stored in static variables, and then all other functions
35  should take that pointer.  [The new API is available in CURRENT and
36  conversion of portions of INN to use it is in progress.  Once that
37  conversion is done, the old API can be dropped and the overview backends
38  converted to speak the new API at all levels.]
39
40* The TLS layer in nnrpd badly needs to be rewritten.  Currently, it
41  doesn't handle timeouts and the code could be cleaned up considerably.
42  Support for GnuTLS would also be nice, as would client support.
43
44* Convert readers.conf and storage.conf (and related configuration files)
45  to use the new parsing system and break out program-specific sections
46  of inn.conf into their own groups.
47
48* The current WIP cache and history cache should be integrated into the
49  history API, things like message ID hashing should become a selectable
50  property of the history file, and the history API should support
51  multiple backend storage formats and automatically select the right one
52  for an existing history file based on stored metainformation.
53
54* The interface to embedded filters needs to be reworked.  The information
55  about which filters are enabled should be isolated in the filtering API,
56  and there should be standard API calls for filtering message IDs, remote
57  posts, and local posts.  As part of this revision, all of the Perl
58  callbacks should be defined before any of the user code is loaded, and
59  the Perl loading code needs considerable cleanup.  At the same time as
60  this is done, the implementation should really be documented; we do some
61  interesting things with embedded filters and it would be nice to have a
62  general document describing how we do it.  [Russ is planning on working
63  on this at some point, but won't get upset if someone starts first.]
64
65* All of INN's documentation should be written in POD, with text and man
66  pages generated from the POD source.  Anyone is encouraged to work on
67  this by just taking any existing documentation in man format and convert
68  it to POD while checking that it's still accurate and adding any
69  additional useful information that was missed.
70
71* Include the necessary glue so that Perl modules can be added to INN's
72  build tree and installed with INN, allowing their capabilities to be
73  available to the portions of INN written in Perl.
74
75* Switch nnrpd over to using the new wildmat routines rather than breaking
76  apart strings on commas and matching each expression separately.  This
77  involves a lot of surgery, since PERMmatch is used all over the place,
78  and may change the interpretation of ! and @ in group permission
79  wildmats.
80
81* Rework and clean up the storage API.  The major change is that the
82  initialization function should return a pointer to an opaque struct
83  which stores all of the state of the storage subsystem, rather than all
84  of that being stored in static variables, and then all other functions
85  should take that pointer.  More of the structures should also be opaque,
86  all-caps structure names should be avoided in favor of named structures,
87  SMsetup and SMinit should be combined into one function that takes
88  flags, SMerrno and SMerrorstr should be replaced with functions that
89  return that information, and the wire format utilities should be moved
90  into libinn.
91
92
93Scheduled for INN 2.8
94
95* Add a generic, modular anti-spam and anti-abuse filter, off by default,
96  but coming with INN and prominently mentioned in the INSTALL
97  documentation.  [Andrew Gierth has work in progress that may be usable
98  for this.]
99
100* A unified configuration file combining the facilities of newsfeeds,
101  incoming.conf, and innfeed.conf, but hopefully more readable and easier
102  for new INN users to edit.  This should have all of the capabilities of
103  the existing configuration files, but specifying common things (such as
104  file feeds or innfeed feeds) should be very simple and straightforward.
105  This configuration file should use the new parsing infrastructure.
106
107* Convert all remaining INN configuration files to the new parsing
108  infrastructure.
109
110* INN really should be capable of both sending and receiving a
111  headers-only feed (or even an overview-only feed) similar to Diablo and
112  using it for the same things that Diablo does, namely clustering,
113  pull-on-demand for articles, and the like.  This should be implementable
114  as a new backend, although the API may need a few more hooks.  Both a
115  straight headers-only feed that only pulls articles down via NNTP from a
116  remote server and a caching feed where some articles are pre-fed, some
117  articles are pulled down at first read, and some articles are never
118  stored locally should be possible.  [A patch for a header-only feed
119  from innfeed is included in ticket #17.]
120
121* The libinn, libstorage, and other library interfaces should be treated
122  as stable libraries and properly versioned using libtool's
123  recommendation for library versioning when changes are made so that they
124  can be installed as shared libraries and work properly through releases
125  of INN.  This is currently waiting on a systematic review of the
126  interface and removal of things that we don't want to support long-term.
127
128* The include files necessary to use libinn, libstorage, and other
129  libraries should be installed in a suitable directory so that other
130  programs can link against them.  All such include files should be under
131  include/inn and included with <inn/header.h>.  All such include files
132  should only depend on other inn/* header files and not on, e.g.,
133  config.h.  All such include files should be careful about namespace to
134  avoid conflicts with other include files used by applications.
135
136
137High Priority Projects
138
139* INN shouldn't flush all feeds (particularly all program feeds) on
140  newgroup or rmgroup.  Currently it reloads newsfeeds to reparse all of
141  the wildmat patterns and rebuild the peer lists associated with the
142  active file on group changes, and this forces a flush of all feeds.
143  The best fix is probably to stash the wildmat pattern (and flags) for
144  each peer when newsfeeds is read and then just using the stashed copy on
145  newgroup or rmgroup, since otherwise the newsfeeds loading code would
146  need significant modification.  But in general, innd is too
147  reload-happy; it should be better at making incremental changes without
148  reloading everything.
149
150* Add authenticated Path support, based on USEPRO (RFC 5537).
151  [Andrew Gierth wrote a patch for part of this a while back, which
152  is included in ticket #19.  Marco d'Itri expressed some interest in
153  working on this.]
154
155* Various parts of INN are using write or writev; they should all use
156  xwrite or xwritev instead.  Even for writes that are unlikely to ever be
157  partial, on some systems system calls aren't restartable and xwrite and
158  xwritev properly handle EINTR returns.
159
160* Apparently on Solaris open can also be interrupted by a signal; we may
161  need to have an xopen wrapper that checks for EINTR and retries.
162
163* tradspool has a few annoying problems.  Deleted newsgroups never have
164  their last articles expired, and there is no way of forcibly
165  resynchronizing the articles stored on disk with what overview knows
166  about unless tradindexed is used.  Some sort of utility program to take
167  care of these and to do things like analyze the tradspool.map file
168  should be provided.
169
170* contrib/mkbuf and contrib/reset-cnfs.c should be combined into a utility
171  for creating and clearing cycbuffs, perhaps combined with cnfsheadconf,
172  and the whole thing moved into storage/cnfs rather than frontends (along
173  with cnfsstat).  pullart.c may also stand to be merged into the same
174  utility (cnfs-util might not be a bad name).
175
176* The Berkeley DB integration of INN needs some improvements in robustness.
177  Currently, BerkeleyDB functions can be called by nnrpd out of signal
178  handlers and in other unfortunate situations, and coordination between
179  nnrpd and innd isn't entirely robust.  Berkeley DB 4.4 offers a new
180  DB_REGISTER flag to open to allow for multi-process use of Berkeley DB
181  databases and use of that flag should be investigated.
182
183
184Documentation Projects
185
186* Add man pages for all libinn interfaces.  There should be a subdirectory
187  of doc/pod for this since there will be a lot of them; installing them
188  as libinn_<section>.3 seems to make the most sense (so, for example,
189  error handling routines would be documented in libinn_error.3).
190
191* Better documentation of and support for UUCP feeds.  send-uucp is now
192  easier to use, but there's still a paucity of documentation covering the
193  whole theory and mechanisms of UUCP feeding.
194
195* Everything installed by INN should have a man page.  Currently, there
196  are several binaries and configuration files that don't have man pages.
197  (In some cases, the best thing to do with the configuration file may be
198  to merge it into another one or find a way to eliminate it.)
199
200* Document the internal formats of the various overview methods, CNFS,
201  timehash, and timecaf.  A lot of this documentation already exists in
202  various forms, but it needs to be cleaned up and collected in one place
203  for each format, preferrably as a man page.
204
205* Add documentation for slave servers.  [Russ has articles from
206  inn-workers that can be used as a beginning.]
207
208* Write complete documentation for all of our extensions to RFC 3977 or RFC
209  5536 and 5537, preferrably in a format that could be suitable for future
210  inclusion into new revisions of the RFCs.
211
212* More comprehensive documentation in texinfo would be interesting; it
213  would allow for better organization, separation of specialized topics
214  into cleaner chapters, and a significantly better printed manual.  This
215  would be a tremendous amount of work, though.
216
217
218Code Cleanup Projects
219
220* Eliminate everything in the LEGACY section of include/inn/options.h.
221
222* Go over include/inn/options.h and try to eliminate as many of the
223  compile-time options there as possible.  They should all be run-time
224  options instead if at all possible, maybe in specific sub-sections of
225  inn.conf.
226
227* Check to be sure we still need all of the #defines in
228  include/inn/paths.h and look at adding anything needed by innfeed (and
229  eliminating the separate innfeed header serving the same purpose).
230
231* Use vectors or cvectors everywhere that argify and friends are currently
232  used and eliminate the separate implementation in nnrpd/misc.c.
233
234* Break up the remainder of inn/libinn.h into multiple inn/* include files
235  for specific functions (such as memory management, wildmat, date handling,
236  NNTP commands, etc.), with an inn/util.h header to collect the remaining
237  random utilities.  Consider adding some sort of prefix, like inn_, to all
238  functions that aren't part of some other logical set with its own prefix.
239
240* Break the CNFS and tradspool code into multiple source files to make it
241  easier to understand the logical divisions of the code and consider
242  doing the same with the other overview and storage methods.
243
244* The source and bind addresses currently also take "any" or "all"
245  wildcards, which can more easily be specified by just not setting them
246  at all.  Remove those special strings, modify innupgrade to fix inn.conf
247  files using them, and simplify the code.  (It's not completely clear
248  that this is the right thing to do.)
249
250
251Needed Bug Fixes
252
253* Don't require an Xref slave carry all of the groups of its upstream.
254  Fixing this will depend on the new overview API (so that overview
255  records are stored separately in each group under innd's control) and
256  will require ignoring failures to store overview records because the
257  group doesn't exist, or checking first to ensure the group exists
258  before trying to store the record.
259
260* tradspool currently uses stdio to write out tradspool.map, which can
261  cause problems if more than 256 file descriptors are in use for other
262  things (such as incoming connections or tradindexed overview cache).
263  It should use write() instead.
264
265* LIST NEWSGROUPS should probably only list newsgroups that are marked in
266  the active file as valid groups.
267
268* INN's startup script should be sure to clean out old lock files and PID
269  files for innfeed.  Be careful, though, since innfeed may still be
270  running, spawned from a previous innd.
271
272* makedbz should be more robust in the presence of malformed history
273  lines, discarding with them or otherwise dealing with them.
274
275* Some servers reject some IHAVE, TAKETHIS, or CHECK commands with 500
276  syntax errors (particularly for long message IDs), and innfeed doesn't
277  handle this particularly well at the moment.  It really should have an
278  error handler for this case.  [Sven Paulus has a preliminary patch that
279  needs testing, included in ticket #26.]
280
281* Editing the active file by hand can currently munge it fairly badly even
282  if the server is throttled unless you reload active before restarting
283  the server.  This could be avoidable for at least that particular case
284  by checking the mtime of active before and after the server was
285  throttled.
286
287* innreport silently discards news.notice entries about most of the errors
288  innfeed generates.  It should ideally generate some summary, or at least
289  note that some error has occurred and the logs should be examined.
290
291* Handling of compressed batches needs to be thoroughly reviewed by
292  someone who understands how they're supposed to work.  It's not clear
293  that INN_PATH_GZIP is being used correctly at the moment and that
294  compressed batch handling will work right now on systems that don't have
295  gzip installed (but that do have uncompress).
296
297* innfeed's statistics don't add up properly all the time.  All of the
298  article dispositions don't add up to the offered count like they should.
299  Some article handling must not be recorded properly.
300
301* If a channel feed exits immediately, innd respawns it immediately,
302  causing thrashing of the system and a huge spew of errors in syslog.  It
303  should mark the channel as dormant for some period of time before
304  respawning it, perhaps only if it's already died multiple times in a
305  short interval.
306
307* ctlinnd begin <site-name> was causing innd to core dump.
308
309* Handling of innfeed's dropped batches needs looking at.  There are three
310  places where articles can fall between the cracks:  an innfeed.togo file
311  written by innd when the feed can't be spawned, a batch file named after
312  the feed name which can be created under similar circumstances, and the
313  dropped files written by innfeed itself.  procbatch can clean these up,
314  but has to be run by hand.
315
316* When using tradspool, groups are not immediately added to tradspool.map
317  when created, making innfeed unable to find the articles until after
318  some period of time.  Part of the problem here is that tradspool only
319  updates tradspool.map on a lazy basis, when it sees an article in that
320  group, since there is no storage hook for creation of a new group.
321
322* nntpget doesn't handle long lines in messages.
323
324* WP feeds break if there are spaces in the Path header, and the inn.conf
325  parser doesn't check for this case and will allow people to configure
326  their server that way.  (It's not clear that the latter is actually a
327  bug, given the new USEFOR attempt to allow folding of Path headers, but
328  the space needs to be removed for WP feeds.)
329
330* innd returns 437 for articles that were accepted but filed in the junk
331  group.  It should probably return the appropriate 2xx status code in
332  that case instead.
333
334* SIGPIPE handling in nnrpd calls all sorts of functions that shouldn't be
335  called from inside a signal handler.
336
337* Someone should go through the BUGS sections of all of the man pages and
338  fix those for which the current behavior is unacceptable.
339
340* The ovdb server utilities don't handle unclean shutdowns very well.
341  They may leave PID files sitting around, fail to start properly, and
342  otherwise not do what's expected.  This area needs some thought and a
343  careful design.
344
345* tdx-util can't fix hash chain loops of length greater than one, and they
346  cause both tdx-util -F and innd to hang.
347
348* CNFS is insufficiently robust when corrupt buffers are encountered.
349  Setting cnfscheckfudgesize clears up issues that otherwises causes INN
350  to crash.
351
352* There should be a way, with the Perl authentication hooks, to either
353  immediately return a list of newsgroups one has access to based on the
354  hostname or to indicate that authentication is required and make the
355  user be prompted with a 480 code when they try to access anything.
356  Right now, that doesn't appear to be possible.
357
358* Handling of article rejections needs a lot of cleanup in innd/art.c so
359  that one can cleanly indicate whether a given rejection should be
360  remembered or not, and the code that does the remembering needs to be
361  refactored and shared.  Once this is done, we need to not remember
362  rejections for duplicated Xref headers.
363
364* rnews's error handling for non-zero exits by child batch decompressers
365  could use some improvement (namely, there should be some).  Otherwise,
366  when gzip fails for some reason, we read zero bytes and then throw
367  away the batch without realizing we have an error.
368
369
370Requested New Features
371
372* Consider implementing the HEADERS command as discussed rather
373  extensively in news.software.nntp.  HEADERS was intended as a general
374  replacement for XHDR and XPAT.  [Greg Andruk has a preliminary patch.]
375
376* There have been a few requests for the ability to programmatically set
377  the subject of the report generated by news.daily, with escapes that are
378  filled in by the various pieces of information that might be useful.
379
380* A bulk cancel command using the MODE CANCEL interface.  Possibly through
381  ctlinnd, although it may be a bit afield of what ctlinnd is currently
382  for.
383
384* Sven Paulus's patch for nnrpd volume reports should be integrated.
385  [Patch included in ticket #135.]
386
387* Lots of people encrypt Injection-Info in various ways.  Should that be
388  offered as a standard option?  The first data element should probably
389  remain unencrypted so that the O flag in newsfeeds doesn't break.
390
391  Should there also be an option not to generate Injection-Info?
392
393  Olaf Titz suggests for encryption:
394
395      This can be done by formatting injection fields in a way that they
396      are always a multiple of 8 bytes and applying a 64 bit block cipher
397      in ECB mode on it (for instance "395109AA000016FF").
398
399* ctlinnd flushlogs currently renames all of the log files.  It would be
400  nice to support the method of log rotation that most other daemons
401  support, namely to move the logs aside and then tell innd to reopen its
402  log files.  Ideally, that behavior would be triggered with a SIGHUP.
403  scanlogs would have to be modified to handle this.
404
405  The best way to support this seems to be to leave scanlogs as is by
406  default, but also add two additional modes.  One would flush all the
407  logs and prepare for the syslog logs to be rotated, and the other would
408  do all the work needed after the logs have been rotated.  That way, if
409  someone wanted to plug in a separate log rotation handler, they could do
410  so and just call scanlogs on either side of it.  The reporting portions
411  of scanlogs should be in a separate program.
412
413* Several people have Perl interfaces to pieces of INN that should ideally
414  be part of the INN source tree in some fashion.  Greg Andruk has a bunch
415  of stuff that Russ has copies of, for example.  [Patches included in
416  tickets #133 and #134.]
417
418* There are various available patches for Cancel-Lock and an Internet
419  draft; support should be added to INN for both generation and
420  verification (definitely optional and not on by default at this point).
421
422* It would be nice to be able to reload inn.conf (although difficult, due
423  to the amount of data that's generated from it and stashed in various
424  places).
425
426* remembertrash currently rejects and remembers articles with syntax
427  errors as well as things like unwanted newsgroups and unwanted
428  distributions, which means that if a peer sends you a bunch of mangled
429  articles, you'll then also reject the correct versions of the articles
430  from other peers.  This should probably be rethought.
431
432* Additional limits for readers.conf:  Limit on concurrent parallel reader
433  streams, limit on KB/second download (preliminary support for this is
434  already in), and a limit on maximum posted articles per day (tied in
435  with the backoff stuff?).  These should be per-IP or per-user, but
436  possibly also per-access group.  (Consider pulling the -H, -T, -X, and
437  -i code out from innd and using it here.)
438
439* timecaf should have more configurable parameters (at the least, how
440  frequently to switch to a new CAF file should be an option).
441  storage.conf should really be extended to allow method-specific
442  configuration for things like this (and to allow the cycbuff.conf file
443  to be merged into storage.conf).
444
445* Allow generation of arbitrary additional information that could go in
446  overview by using embedded Perl or Python code.  This might be a cleaner
447  way to do the keywords code, which really wants Perl's regex engine
448  ideally.  It would also let one do something like doing MD5 hashes of
449  each article and putting that in the overview if you care a lot about
450  making sure that articles aren't corrupted.
451
452* Allow some way of accepting articles regardless of the Date header, even
453  if it's far into the future.  Some people are running into articles that
454  are dated years into the future for some reason that they still want to
455  store on the server.
456
457* There was a request to make --program-suffix and the other name
458  transformation options to autoconf work.  The standard GNU package does
459  this with really ugly sed commands in the Makefile rules; we could
460  probably do better, perhaps by substituting the autoconf results into
461  support/install-sh.
462
463* INN currently uses hash tables to store the active file internally.  It
464  would be worth trying ternary search trees to see if they're faster; the
465  data structure is simpler, performance may be comparable for hits and
466  significantly better for misses, sizing and resizing becomes a non-issue,
467  and the space penalty isn't too bad.  A generic implementation is already
468  available in libinn.  (An even better place to use ternary search trees
469  may be the configuration parser.)
470
471* Provide an innshellvars equivalent for Python.
472
473* inncheck should check the syntax of all the various files that are
474  returned by LIST commands, since having those files present with the
475  wrong syntax could result in non-compliant responses from the server.
476  Possibly the server should also refuse to send malformatted lines to
477  the client.
478
479* ctlinnd reload incoming.conf could return a count of the hosts that
480  failed, or even better a list of them.  This would make pruning old
481  stuff out of incoming.conf much easier.
482
483* nnrpd could use sendfile(2), if available, to send articles directly
484  to the socket (for those storage methods where to-wire conversion is
485  not needed).  This would need to be added to the storage API.
486
487* Somebody should look at keeping the "newsgroups" file more accurate
488  (e.g. newgroups for existing groups should change description, better
489  checkgroups handling, checking for duplicates).
490
491* The by-domain statistics innreport generates for nnrpd count all local
492  connections (those with no "." in the hostname) in with the errors as
493  just "?".  The host2dom function could be updated to group these as
494  something like "Local".
495
496* news.daily could detect if expire segfaults and unpause the server.
497
498* When using SSL, track the amount of data that's been transferred to the
499  client and periodically renegotiate the session key.
500
501* When using SSL, use SSL_get_peer to get a verified client certificate,
502  if available, and use it to create an additional header line when
503  posting articles (X-Auth-Poster?).  This header could use:
504
505      X509_NAME_oneline(X509_get_subject_name(peer),...)
506
507  for the full distinguished name, or
508
509      X509_name_get_text_by_NID(X509_get_subject_name(peer),
510                                NID_commonName, ...)
511
512  for the client's "common name" alone.
513
514* When using SSL, use the server's key to generate an HMAC of the body of
515  the message (and most headers?), then include that digest in the
516  headers.  This allows a news administrator to determine if a complaint
517  about the content of a message is fraudulent since the message was
518  changed after transmission.
519
520* Allow permission for posting cancels to be configured in readers.conf
521  in an access block.
522
523* Allow the applicability of auth blocks to be restricted to particular
524  username patterns, probably by adding a users: key to the auth block
525  that matches similar to hosts:.
526
527* It would be nice to have bindaddress (and bindaddress6) as a peer block
528  parameter and not just a global parameter in innfeed.
529
530* Add cnfsstat to innstat.  cnfsstat really needs a more succinct mode
531  before doing this, since right now the output can be quite verbose.
532
533
534General Projects
535
536* All the old packages in unoff-contrib should be reviewed for integration
537  into INN.
538
539* It may be better for INN on SysV-derived systems to use poll rather than
540  select.  The semantics are better, and on some systems (such as Solaris)
541  select is limited to 1024 file descriptors whereas poll can handle any
542  number.  Unfortunately, the API is drastically different between the
543  two and poll isn't portable, so supporting both cleanly would require a
544  bit of thought.
545
546* Currently only innd and innfeed increase their file descriptor limits.
547  Other parts of INN, notably makehistory, may benefit from doing the same
548  thing if they can without root privileges.
549
550* Revisit support for aliased groups and what nnrpd does with them.
551  Should posts to the alias automatically be redirected to the real group?
552  Regardless, the error return should provide useful information about
553  where to post instead.  Also, the new overview API, for at least some of
554  the overview methods, truncated the group status at one character and
555  lost the name of the group to which a group is aliased; that needs to be
556  fixed.
557
558* More details as to why a message ID is bad would be useful to return to
559  the user, particularly for rnews, inews, etc.
560
561* Support putting the history file in different directory from the other
562  (much smaller) db files without hand-editing a bunch of files.
563
564* frontends/pullnews and frontends/nntpget should be merged in one script.
565
566* backends/filechan is just a simple version of backends/buffchan.  It
567  looks like filechan could just be deleted and innupgrade taught to change
568  filechan feeds to buffchan -u feeds.  map.c, which is only used by those
569  two programs, could just be included in the same source file.
570
571* actsyncd could stand a rewrite and cleaner handling of both
572  configuration and syncing against multiple sources which are canonical
573  for different sets of groups.  In the process, FTP handling should
574  support debugging.
575
576* send-nntp and nntpsend basically do the same thing; send-nntp could
577  probably be removed (possibly with some extra support in nntpsend for
578  doing simpler things).
579
580
581Long-Term Projects
582
583* Look at turning header parsing into a library of some sort.  Lots of INN
584  does this, but different parts of INN need subtly different things, so
585  the best API is unclear.
586
587* INN's header handling needs to be checked against RFC 5536 and 5537.
588  This may want wait until after we have a header parsing library.
589
590* The innd filter should be able to specify additional or replacement
591  groups into which an article should be filed, or even spool the article
592  to a local disk file rather than storing it.  (See the stuff that the
593  nnrpd filter can already do.)
594
595* When articles expire out of a storage method with self-expire
596  functionality, the overview and history entries for those articles
597  should also be expired immediately.  Otherwise, things like the GROUP
598  command don't give the correct results.  This will likely require a
599  callback that can be passed to CNFS that is called to do the overview
600  and history cleanup for each article overwritten.
601
602* Feed control, namely allowing your peers to set policy on what articles
603  you feed them (not just newsgroups but max article size and perhaps even
604  filter properties like "non-binary").  Every site does this a bit
605  differently.  Some people have web interfaces, some people use GUP, some
606  people roll their own alternate things.  It would really be nice to have
607  some good way of doing this as part of INN.  It's worth considering an
608  NNTP extension for this purpose, although the first step is to build a
609  generic interface that an NNTP extension, a web page, etc. could all
610  use.  (An alternate way of doing this would be to extend IHAVE to pass
611  the list of newsgroups as part of the command, although this doesn't
612  seem as generally useful.)
613
614* Traffic classification as an extension of filtering.  The filter should
615  be able to label traffic as binary (e.g.) without rejecting it, and
616  newsfeeds should be extended to allow feeding only non-binary articles
617  (e.g.) to a peer.
618
619* External authenticators should also be able to do things like return a
620  list of groups that a person is allowed to read or post to.  Currently,
621  maintaining a set of users and a set of groups, each of which some
622  subset of the users is allowed to access, is far too difficult.  For a
623  good starting list of additional functionality that should be made
624  available, look at everything the Perl authentication hooks can do.
625
626* Allow nnrpd to spawn long-running helper processes.  Not only would this
627  be useful for handling authentication (so that the auth hooks could work
628  without execing a program on every connection), but it may allow for
629  other architectures for handling requests (such as a pool of helpers
630  that deal only with overview requests).  More than that, nnrpd should
631  *be* a long-running helper process that innd can feed open file
632  descriptors to.  [Aidan Culley has ideas along these lines.]
633
634* The tradspool storage method requires assigning a number to every
635  newsgroup (for use in a token).  Currently this is maintained in a
636  separate tradspool.map file, but it would be much better to keep that
637  information in the active file where it can't drop out of sync.  A code
638  assigned to each newsgroup would be useful for other things as well,
639  such as hashing the directories for the tradindexed overview.  For use
640  for that purpose, though, the active file would have to be extended to
641  include removed groups, since they'd need to be kept in the active file
642  to reserve their numbers until the last articles expired.
643
644* The locking of the active file leaves something to be desired; in
645  general, the locking in INN (for the active file, the history file,
646  spool updates, overview updates, and the like) needs a thorough
647  inspection and some cleanup.  A good place to start would be tracing
648  through the pause and throttle code and write up a clear description of
649  what gets locked where and what is safely restarted and what isn't.
650  Long term, there needs to be a library locking routine used by
651  *everything* that needs to write to the history file, active file, etc.
652  and that keeps track of the PID of the process locking things and is
653  accessible via ctlinnd.
654
655* There is a fundamental problem with the current design of the
656  control.ctl file.  It combines two things:  A database of hierarchies,
657  their maintainers, and related information, and a list of which
658  hierarchies the local server should honor.  These should be separated
659  out into the database (which could mostly be updated from a remote
660  source like ftp.isc.org and then combined with local additions) and a
661  configured list of hierarchies (or sub-hierarchies within hierarchies)
662  that control messages should be honored for.  This should be reasonably
663  simple although correct handling of checkgroups could get a mite tricky.
664
665* Possible NNTP extension:  Compression of the protocol, using gzip,
666  bzip2, or some other technique.  Particularly useful for long lists like
667  the active file information or the overview information, but possibly
668  useful in general for other things.
669
670* Install wizards.  Configuring INN is currently very complex even for an
671  experienced news admin, and there are several fairly standard
672  configurations that shouldn't be nearly that complicated to get running
673  out of the box.  A little interactive Perl script asking some simple
674  questions could probably get a lot of cases easily right.
675
676* One ideally wants to be able to easily convert between different
677  overview formats or storage methods, refiling articles in place.  This
678  should be possible once we have a history API that allows changing the
679  storage location of an article in-place.
680
681* Set up the infrastructure required so that INN can use alloca.  This
682  would significantly decrease the number of calls to malloc needed and
683  would be a lot more convenient.  alloca is now available, but most
684  programs still need to call alloca_free in their main loops before we
685  can use it in the various libraries.
686
687* Support building in a separate directory than the source tree.  It may
688  be best to just support this via lndir rather than try to do it in
689  configure, but it would be ideal to add support for this to the autoconf
690  system.  Unfortunately, the standard method requires letting configure
691  generate all of the makefiles, which would make running configure and
692  config.status take much longer than it does currently.
693
694* Look at adding some kind of support for MODE CANCEL via network sockets
695  and fixing up the protocol so that it could possibly be standardized
696  (the easiest thing to do would probably be to change it into a CANCEL
697  command).  If we want to get to the point where INN can accept and even
698  propagate such feeds from dedicated spam filters or the like, there must
699  also be some mechanism of negotiating policy in order to decide what
700  cancels the server wants to be fed.
701
702* The "possibly signed" char data type is one of the inherent flaws of C.
703  Some other projects have successfully gotten completely away from this
704  by declaring all of their strings to be unsigned char, defining a macro
705  like U that casts strings to unsigned char for use with literal strings,
706  and always using unsigned char everywhere.  Unfortunately, this also
707  requires wrappering all of the standard libc string functions, since
708  they're prototyped as taking char rather than unsigned char.  The
709  benefits include cleaner and consistent handling of characters over 127,
710  better warnings from the compiler, consistent behavior across platforms
711  with different notions about the signedness of char, and the elimination
712  of warnings from the <ctype.h> macros on platforms like Solaris where
713  those macros can't handle signed characters.  We should look at doing
714  this for INN.
715
716* It would clean up a lot of code considerably if we could just use mmap
717  semantics regardless of whether the system has mmap.  It may be possible
718  to emulate mmap on systems that don't have it by reading the entirety of
719  the file into memory and setting the flags that require things to call
720  mmap_flush and mmap_invalidate on a regular basis, but it's not clear
721  where to stash the file descriptor that corresponds to the mapped file.
722
723* Consider replacing the awkward access: parameter in readers.conf with
724  separate commands (e.g. "allow_newnews: true") or otherwise cleaning up
725  the interaction between access: and read:/post:.  Note that at least
726  allownewnews: can be treated as a setting for overriding inn.conf and
727  should be very easy to add.
728
729* Add a localport: parameter (similar to localaddress:) to readers.conf
730  auth groups.  With those two parameters (and ssl_required:) we
731  essentially eliminate the need to run multiple instances of nnrpd just to
732  use different configurations.
733
734* Various things may break when trying to use data written while compiled
735  with large file support using a server that wasn't so compiled (and vice
736  versa).  The main one is the history file, but tradindexed is also
737  affected and buffindexed has been reported to have problems with this
738  as well.  Ideally, all of INN's data files should be as portable as
739  possible.
740
741
742Code Reorganization
743
744* storage should be reserved just for article storage; the overview
745  methods should be in a separate overview tree.
746
747* The split between frontends and backends is highly non-intuitive.  Some
748  better organization scheme should be arrived at.  Perhaps something
749  related to incoming and outgoing, with programs like cnfsstat moved into
750  the storage directory with the other storage-related code?
751
752* Add a separate utils directory for things like convdate, shlock,
753  shrinkfile, and the like.  Some of the scripts may possibly want to go
754  into that directory too.
755
756* The lib directory possibly should be split so that it contains only code
757  always compiled and part of INN, and the various replacements for
758  possibly missing system routines are in a separate directory (such as
759  replace).  These should possibly be separate libraries; there are things
760  that currently link against libinn that only need the portability
761  pieces.
762
763* The doc directory really should be broken down further by type of
764  documentation or section or something; it's getting a bit unwieldy.
765
766* Untabify and reformat all of the code according to a consistent coding
767  style which would then be enforced for all future check-ins.
768