This is a rough and informal list of suggested improvements to INN, parts
of INN that need work, and other tasks yet undone. Some of these may be
in progress, in which case the person working on them will be noted in
square brackets and should be contacted if you want to help. Otherwise,
let <inn-workers@lists.isc.org> know if you'd like to work on any item listed
below.

This list is currently being migrated to INN's issue tracking system
<https://inn.eyrie.org/trac/>.

The list is divided into changes already tentatively scheduled for a
particular release, higher priority changes that will hopefully be done in
the near future, small or medium-scale projects for the future, and
long-term, large-scale problems. Note that just because a particular
feature is scheduled for a later release doesn't mean it can't be
completed earlier if someone decides to take it on. The association of
features with releases is intended to be a rough guide for prioritization
and a set of milestones to use to judge when a new major release is
justified.

Also, one major thing that is *always* welcome is additions to the test
suite, which is currently very minimal. Any work done on the test suite
to allow more portions of INN to be automatically tested will make all
changes easier and will be *greatly* appreciated.

Last modified $Id: TODO 9949 2015-09-12 13:48:45Z iulius $.


Scheduled for INN 2.7

* Rework and clean up the overview API. The major change is that the
  initialization function should return a pointer to an opaque struct
  which stores all of the state of the overview subsystem, rather than all
  of that being stored in static variables, and then all other functions
  should take that pointer. [The new API is available in CURRENT and
  conversion of portions of INN to use it is in progress.
  Once that conversion is done, the old API can be dropped and the
  overview backends converted to speak the new API at all levels.]

* The TLS layer in nnrpd badly needs to be rewritten. Currently, it
  doesn't handle timeouts and the code could be cleaned up considerably.
  Support for GnuTLS would also be nice, as would client support.

* Convert readers.conf and storage.conf (and related configuration files)
  to use the new parsing system and break out program-specific sections
  of inn.conf into their own groups.

* The current WIP cache and history cache should be integrated into the
  history API, things like message ID hashing should become a selectable
  property of the history file, and the history API should support
  multiple backend storage formats and automatically select the right one
  for an existing history file based on stored metainformation.

* The interface to embedded filters needs to be reworked. The information
  about which filters are enabled should be isolated in the filtering API,
  and there should be standard API calls for filtering message IDs, remote
  posts, and local posts. As part of this revision, all of the Perl
  callbacks should be defined before any of the user code is loaded, and
  the Perl loading code needs considerable cleanup. At the same time as
  this is done, the implementation should really be documented; we do some
  interesting things with embedded filters and it would be nice to have a
  general document describing how we do it. [Russ is planning on working
  on this at some point, but won't get upset if someone starts first.]

* All of INN's documentation should be written in POD, with text and man
  pages generated from the POD source. Anyone is encouraged to work on
  this by just taking any existing documentation in man format and
  converting it to POD while checking that it's still accurate and adding
  any additional useful information that was missed.

* Include the necessary glue so that Perl modules can be added to INN's
  build tree and installed with INN, allowing their capabilities to be
  available to the portions of INN written in Perl.

* Switch nnrpd over to using the new wildmat routines rather than breaking
  apart strings on commas and matching each expression separately. This
  involves a lot of surgery, since PERMmatch is used all over the place,
  and may change the interpretation of ! and @ in group permission
  wildmats.

* Rework and clean up the storage API. The major change is that the
  initialization function should return a pointer to an opaque struct
  which stores all of the state of the storage subsystem, rather than all
  of that being stored in static variables, and then all other functions
  should take that pointer. More of the structures should also be opaque,
  all-caps structure names should be avoided in favor of named structures,
  SMsetup and SMinit should be combined into one function that takes
  flags, SMerrno and SMerrorstr should be replaced with functions that
  return that information, and the wire format utilities should be moved
  into libinn.


Scheduled for INN 2.8

* Add a generic, modular anti-spam and anti-abuse filter, off by default,
  but coming with INN and prominently mentioned in the INSTALL
  documentation. [Andrew Gierth has work in progress that may be usable
  for this.]

* A unified configuration file combining the facilities of newsfeeds,
  incoming.conf, and innfeed.conf, but hopefully more readable and easier
  for new INN users to edit. This should have all of the capabilities of
  the existing configuration files, but specifying common things (such as
  file feeds or innfeed feeds) should be very simple and straightforward.
  This configuration file should use the new parsing infrastructure.

* Convert all remaining INN configuration files to the new parsing
  infrastructure.

* INN really should be capable of both sending and receiving a
  headers-only feed (or even an overview-only feed) similar to Diablo and
  using it for the same things that Diablo does, namely clustering,
  pull-on-demand for articles, and the like. This should be implementable
  as a new backend, although the API may need a few more hooks. Both a
  straight headers-only feed that only pulls articles down via NNTP from a
  remote server and a caching feed where some articles are pre-fed, some
  articles are pulled down at first read, and some articles are never
  stored locally should be possible. [A patch for a header-only feed
  from innfeed is included in ticket #17.]

* The libinn, libstorage, and other library interfaces should be treated
  as stable libraries and properly versioned using libtool's
  recommendation for library versioning when changes are made so that they
  can be installed as shared libraries and work properly through releases
  of INN. This is currently waiting on a systematic review of the
  interface and removal of things that we don't want to support long-term.

* The include files necessary to use libinn, libstorage, and other
  libraries should be installed in a suitable directory so that other
  programs can link against them. All such include files should be under
  include/inn and included with <inn/header.h>. All such include files
  should only depend on other inn/* header files and not on, e.g.,
  config.h. All such include files should be careful about namespace to
  avoid conflicts with other include files used by applications.


High Priority Projects

* INN shouldn't flush all feeds (particularly all program feeds) on
  newgroup or rmgroup. Currently it reloads newsfeeds to reparse all of
  the wildmat patterns and rebuild the peer lists associated with the
  active file on group changes, and this forces a flush of all feeds.
  The best fix is probably to stash the wildmat pattern (and flags) for
  each peer when newsfeeds is read and then just use the stashed copy on
  newgroup or rmgroup, since otherwise the newsfeeds loading code would
  need significant modification. But in general, innd is too
  reload-happy; it should be better at making incremental changes without
  reloading everything.

* Add authenticated Path support, based on USEPRO (RFC 5537).
  [Andrew Gierth wrote a patch for part of this a while back, which
  is included in ticket #19. Marco d'Itri expressed some interest in
  working on this.]

* Various parts of INN are using write or writev; they should all use
  xwrite or xwritev instead. Even for writes that are unlikely to ever be
  partial, on some systems system calls aren't restartable and xwrite and
  xwritev properly handle EINTR returns.

* Apparently on Solaris open can also be interrupted by a signal; we may
  need to have an xopen wrapper that checks for EINTR and retries.

* tradspool has a few annoying problems. Deleted newsgroups never have
  their last articles expired, and there is no way of forcibly
  resynchronizing the articles stored on disk with what overview knows
  about unless tradindexed is used. Some sort of utility program to take
  care of these and to do things like analyze the tradspool.map file
  should be provided.

* contrib/mkbuf and contrib/reset-cnfs.c should be combined into a utility
  for creating and clearing cycbuffs, perhaps combined with cnfsheadconf,
  and the whole thing moved into storage/cnfs rather than frontends (along
  with cnfsstat). pullart.c may also stand to be merged into the same
  utility (cnfs-util might not be a bad name).

* The Berkeley DB integration of INN needs some improvements in robustness.
  Currently, Berkeley DB functions can be called by nnrpd out of signal
  handlers and in other unfortunate situations, and coordination between
  nnrpd and innd isn't entirely robust. Berkeley DB 4.4 offers a new
  DB_REGISTER flag to open to allow for multi-process use of Berkeley DB
  databases and use of that flag should be investigated.


Documentation Projects

* Add man pages for all libinn interfaces. There should be a subdirectory
  of doc/pod for this since there will be a lot of them; installing them
  as libinn_<section>.3 seems to make the most sense (so, for example,
  error handling routines would be documented in libinn_error.3).

* Better documentation of and support for UUCP feeds. send-uucp is now
  easier to use, but there's still a paucity of documentation covering the
  whole theory and mechanisms of UUCP feeding.

* Everything installed by INN should have a man page. Currently, there
  are several binaries and configuration files that don't have man pages.
  (In some cases, the best thing to do with the configuration file may be
  to merge it into another one or find a way to eliminate it.)

* Document the internal formats of the various overview methods, CNFS,
  timehash, and timecaf. A lot of this documentation already exists in
  various forms, but it needs to be cleaned up and collected in one place
  for each format, preferably as a man page.

* Add documentation for slave servers. [Russ has articles from
  inn-workers that can be used as a beginning.]

* Write complete documentation for all of our extensions to RFC 3977 or RFC
  5536 and 5537, preferably in a format that could be suitable for future
  inclusion into new revisions of the RFCs.

* More comprehensive documentation in texinfo would be interesting; it
  would allow for better organization, separation of specialized topics
  into cleaner chapters, and a significantly better printed manual. This
  would be a tremendous amount of work, though.


Code Cleanup Projects

* Eliminate everything in the LEGACY section of include/inn/options.h.

* Go over include/inn/options.h and try to eliminate as many of the
  compile-time options there as possible. They should all be run-time
  options instead if at all possible, maybe in specific sub-sections of
  inn.conf.

* Check to be sure we still need all of the #defines in
  include/inn/paths.h and look at adding anything needed by innfeed (and
  eliminating the separate innfeed header serving the same purpose).

* Use vectors or cvectors everywhere that argify and friends are currently
  used and eliminate the separate implementation in nnrpd/misc.c.

* Break up the remainder of inn/libinn.h into multiple inn/* include files
  for specific functions (such as memory management, wildmat, date handling,
  NNTP commands, etc.), with an inn/util.h header to collect the remaining
  random utilities. Consider adding some sort of prefix, like inn_, to all
  functions that aren't part of some other logical set with its own prefix.

* Break the CNFS and tradspool code into multiple source files to make it
  easier to understand the logical divisions of the code and consider
  doing the same with the other overview and storage methods.

* The source and bind addresses currently also take "any" or "all"
  wildcards, which can more easily be specified by just not setting them
  at all. Remove those special strings, modify innupgrade to fix inn.conf
  files using them, and simplify the code. (It's not completely clear
  that this is the right thing to do.)
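
The write/writev and open items under High Priority Projects above come
down to retrying on EINTR and continuing after partial writes. A minimal
sketch of that idea follows. It is an illustration only, not INN's actual
xwrite code; the names write_fully and open_retry are hypothetical, and a
write that keeps returning zero is deliberately left unhandled.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not INN's xwrite): write all len bytes of buf to
   fd, retrying when write is interrupted and continuing after partial
   writes.  Returns len on success, or -1 with errno set on failure. */
ssize_t
write_fully(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t left = len;

    assert(buf != NULL || len == 0);
    while (left > 0) {
        ssize_t n = write(fd, p, left);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data was written */
            return -1;          /* real error; errno is already set */
        }
        p += n;                 /* partial write: advance and try again */
        left -= (size_t) n;
    }
    return (ssize_t) len;
}

/* Hypothetical helper: open a file, retrying if the call is interrupted
   by a signal.  Returns the descriptor, or -1 with errno set. */
int
open_retry(const char *path, int flags, mode_t mode)
{
    int fd;

    assert(path != NULL);
    do {
        fd = open(path, flags, mode);
    } while (fd < 0 && errno == EINTR);
    return fd;
}
```

A caller would use write_fully(fd, data, length) wherever a bare write
appears today and treat any return other than length as a failure.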


Needed Bug Fixes

* Don't require an Xref slave to carry all of the groups of its upstream.
  Fixing this will depend on the new overview API (so that overview
  records are stored separately in each group under innd's control) and
  will require ignoring failures to store overview records because the
  group doesn't exist, or checking first to ensure the group exists
  before trying to store the record.

* tradspool currently uses stdio to write out tradspool.map, which can
  cause problems if more than 256 file descriptors are in use for other
  things (such as incoming connections or tradindexed overview cache).
  It should use write() instead.

* LIST NEWSGROUPS should probably only list newsgroups that are marked in
  the active file as valid groups.

* INN's startup script should be sure to clean out old lock files and PID
  files for innfeed. Be careful, though, since innfeed may still be
  running, spawned from a previous innd.

* makedbz should be more robust in the presence of malformed history
  lines, discarding them or otherwise dealing with them.

* Some servers reject some IHAVE, TAKETHIS, or CHECK commands with 500
  syntax errors (particularly for long message IDs), and innfeed doesn't
  handle this particularly well at the moment. It really should have an
  error handler for this case. [Sven Paulus has a preliminary patch that
  needs testing, included in ticket #26.]

* Editing the active file by hand can currently munge it fairly badly even
  if the server is throttled unless you reload active before restarting
  the server. This could be avoidable for at least that particular case
  by checking the mtime of active before and after the server was
  throttled.

* innreport silently discards news.notice entries about most of the errors
  innfeed generates.
  It should ideally generate some summary, or at least note that some
  error has occurred and the logs should be examined.

* Handling of compressed batches needs to be thoroughly reviewed by
  someone who understands how they're supposed to work. It's not clear
  that INN_PATH_GZIP is being used correctly at the moment and that
  compressed batch handling will work right now on systems that don't have
  gzip installed (but that do have uncompress).

* innfeed's statistics don't add up properly all the time. All of the
  article dispositions don't add up to the offered count like they should.
  Some article handling must not be recorded properly.

* If a channel feed exits immediately, innd respawns it immediately,
  causing thrashing of the system and a huge spew of errors in syslog. It
  should mark the channel as dormant for some period of time before
  respawning it, perhaps only if it's already died multiple times in a
  short interval.

* ctlinnd begin <site-name> was causing innd to core dump.

* Handling of innfeed's dropped batches needs looking at. There are three
  places where articles can fall between the cracks: an innfeed.togo file
  written by innd when the feed can't be spawned, a batch file named after
  the feed name which can be created under similar circumstances, and the
  dropped files written by innfeed itself. procbatch can clean these up,
  but has to be run by hand.

* When using tradspool, groups are not immediately added to tradspool.map
  when created, making innfeed unable to find the articles until after
  some period of time. Part of the problem here is that tradspool only
  updates tradspool.map on a lazy basis, when it sees an article in that
  group, since there is no storage hook for creation of a new group.

* nntpget doesn't handle long lines in messages.

* WP feeds break if there are spaces in the Path header, and the inn.conf
  parser doesn't check for this case and will allow people to configure
  their server that way. (It's not clear that the latter is actually a
  bug, given the new USEFOR attempt to allow folding of Path headers, but
  the space needs to be removed for WP feeds.)

* innd returns 437 for articles that were accepted but filed in the junk
  group. It should probably return the appropriate 2xx status code in
  that case instead.

* SIGPIPE handling in nnrpd calls all sorts of functions that shouldn't be
  called from inside a signal handler.

* Someone should go through the BUGS sections of all of the man pages and
  fix those for which the current behavior is unacceptable.

* The ovdb server utilities don't handle unclean shutdowns very well.
  They may leave PID files sitting around, fail to start properly, and
  otherwise not do what's expected. This area needs some thought and a
  careful design.

* tdx-util can't fix hash chain loops of length greater than one, and they
  cause both tdx-util -F and innd to hang.

* CNFS is insufficiently robust when corrupt buffers are encountered.
  Setting cnfscheckfudgesize clears up issues that otherwise cause INN
  to crash.

* There should be a way, with the Perl authentication hooks, to either
  immediately return a list of newsgroups one has access to based on the
  hostname or to indicate that authentication is required and make the
  user be prompted with a 480 code when they try to access anything.
  Right now, that doesn't appear to be possible.

* Handling of article rejections needs a lot of cleanup in innd/art.c so
  that one can cleanly indicate whether a given rejection should be
  remembered or not, and the code that does the remembering needs to be
  refactored and shared.
  Once this is done, we need to not remember rejections for duplicated
  Xref headers.

* rnews's error handling for non-zero exits by child batch decompressors
  could use some improvement (namely, there should be some). Otherwise,
  when gzip fails for some reason, we read zero bytes and then throw
  away the batch without realizing we have an error.


Requested New Features

* Consider implementing the HEADERS command as discussed rather
  extensively in news.software.nntp. HEADERS was intended as a general
  replacement for XHDR and XPAT. [Greg Andruk has a preliminary patch.]

* There have been a few requests for the ability to programmatically set
  the subject of the report generated by news.daily, with escapes that are
  filled in by the various pieces of information that might be useful.

* A bulk cancel command using the MODE CANCEL interface. Possibly through
  ctlinnd, although it may be a bit afield of what ctlinnd is currently
  for.

* Sven Paulus's patch for nnrpd volume reports should be integrated.
  [Patch included in ticket #135.]

* Lots of people encrypt Injection-Info in various ways. Should that be
  offered as a standard option? The first data element should probably
  remain unencrypted so that the O flag in newsfeeds doesn't break.

  Should there also be an option not to generate Injection-Info?

  Olaf Titz suggests for encryption:

      This can be done by formatting injection fields in a way that they
      are always a multiple of 8 bytes and applying a 64 bit block cipher
      in ECB mode on it (for instance "395109AA000016FF").

* ctlinnd flushlogs currently renames all of the log files. It would be
  nice to support the method of log rotation that most other daemons
  support, namely to move the logs aside and then tell innd to reopen its
  log files. Ideally, that behavior would be triggered with a SIGHUP.
  scanlogs would have to be modified to handle this.

  The best way to support this seems to be to leave scanlogs as is by
  default, but also add two additional modes. One would flush all the
  logs and prepare for the syslog logs to be rotated, and the other would
  do all the work needed after the logs have been rotated. That way, if
  someone wanted to plug in a separate log rotation handler, they could do
  so and just call scanlogs on either side of it. The reporting portions
  of scanlogs should be in a separate program.

* Several people have Perl interfaces to pieces of INN that should ideally
  be part of the INN source tree in some fashion. Greg Andruk has a bunch
  of stuff that Russ has copies of, for example. [Patches included in
  tickets #133 and #134.]

* There are various available patches for Cancel-Lock and an Internet
  draft; support should be added to INN for both generation and
  verification (definitely optional and not on by default at this point).

* It would be nice to be able to reload inn.conf (although difficult, due
  to the amount of data that's generated from it and stashed in various
  places).

* remembertrash currently rejects and remembers articles with syntax
  errors as well as things like unwanted newsgroups and unwanted
  distributions, which means that if a peer sends you a bunch of mangled
  articles, you'll then also reject the correct versions of the articles
  from other peers. This should probably be rethought.

* Additional limits for readers.conf: Limit on concurrent parallel reader
  streams, limit on KB/second download (preliminary support for this is
  already in), and a limit on maximum posted articles per day (tied in
  with the backoff stuff?). These should be per-IP or per-user, but
  possibly also per-access group. (Consider pulling the -H, -T, -X, and
  -i code out from innd and using it here.)

* timecaf should have more configurable parameters (at the least, how
  frequently to switch to a new CAF file should be an option).
  storage.conf should really be extended to allow method-specific
  configuration for things like this (and to allow the cycbuff.conf file
  to be merged into storage.conf).

* Allow generation of arbitrary additional information that could go in
  overview by using embedded Perl or Python code. This might be a cleaner
  way to do the keywords code, which really wants Perl's regex engine
  ideally. It would also let one do something like doing MD5 hashes of
  each article and putting that in the overview if you care a lot about
  making sure that articles aren't corrupted.

* Allow some way of accepting articles regardless of the Date header, even
  if it's far into the future. Some people are running into articles that
  are dated years into the future for some reason that they still want to
  store on the server.

* There was a request to make --program-suffix and the other name
  transformation options to autoconf work. The standard GNU package does
  this with really ugly sed commands in the Makefile rules; we could
  probably do better, perhaps by substituting the autoconf results into
  support/install-sh.

* INN currently uses hash tables to store the active file internally. It
  would be worth trying ternary search trees to see if they're faster; the
  data structure is simpler, performance may be comparable for hits and
  significantly better for misses, sizing and resizing becomes a non-issue,
  and the space penalty isn't too bad. A generic implementation is already
  available in libinn. (An even better place to use ternary search trees
  may be the configuration parser.)

* Provide an innshellvars equivalent for Python.

* inncheck should check the syntax of all the various files that are
  returned by LIST commands, since having those files present with the
  wrong syntax could result in non-compliant responses from the server.
  Possibly the server should also refuse to send malformatted lines to
  the client.

* ctlinnd reload incoming.conf could return a count of the hosts that
  failed, or even better a list of them. This would make pruning old
  stuff out of incoming.conf much easier.

* nnrpd could use sendfile(2), if available, to send articles directly
  to the socket (for those storage methods where to-wire conversion is
  not needed). This would need to be added to the storage API.

* Somebody should look at keeping the "newsgroups" file more accurate
  (e.g. newgroups for existing groups should change description, better
  checkgroups handling, checking for duplicates).

* The by-domain statistics innreport generates for nnrpd count all local
  connections (those with no "." in the hostname) in with the errors as
  just "?". The host2dom function could be updated to group these as
  something like "Local".

* news.daily could detect if expire segfaults and unpause the server.

* When using SSL, track the amount of data that's been transferred to the
  client and periodically renegotiate the session key.

* When using SSL, use SSL_get_peer_certificate to get a verified client
  certificate, if available, and use it to create an additional header
  line when posting articles (X-Auth-Poster?). This header could use:

      X509_NAME_oneline(X509_get_subject_name(peer),...)

  for the full distinguished name, or

      X509_NAME_get_text_by_NID(X509_get_subject_name(peer),
                                NID_commonName, ...)

  for the client's "common name" alone.

* When using SSL, use the server's key to generate an HMAC of the body of
  the message (and most headers?), then include that digest in the
  headers. This allows a news administrator to determine if a complaint
  about the content of a message is fraudulent since the message was
  changed after transmission.

* Allow permission for posting cancels to be configured in readers.conf
  in an access block.

* Allow the applicability of auth blocks to be restricted to particular
  username patterns, probably by adding a users: key to the auth block
  that matches similar to hosts:.

* It would be nice to have bindaddress (and bindaddress6) as a peer block
  parameter and not just a global parameter in innfeed.

* Add cnfsstat to innstat. cnfsstat really needs a more succinct mode
  before doing this, since right now the output can be quite verbose.


General Projects

* All the old packages in unoff-contrib should be reviewed for integration
  into INN.

* It may be better for INN on SysV-derived systems to use poll rather than
  select. The semantics are better, and on some systems (such as Solaris)
  select is limited to 1024 file descriptors whereas poll can handle any
  number. Unfortunately, the API is drastically different between the
  two and poll isn't portable, so supporting both cleanly would require a
  bit of thought.

* Currently only innd and innfeed increase their file descriptor limits.
  Other parts of INN, notably makehistory, may benefit from doing the same
  thing if they can without root privileges.

* Revisit support for aliased groups and what nnrpd does with them.
  Should posts to the alias automatically be redirected to the real group?
  Regardless, the error return should provide useful information about
  where to post instead.
  Also, the new overview API, for at least some of the overview methods,
  truncated the group status at one character and lost the name of the
  group to which a group is aliased; that needs to be fixed.

* More details as to why a message ID is bad would be useful to return to
  the user, particularly for rnews, inews, etc.

* Support putting the history file in a different directory from the other
  (much smaller) db files without hand-editing a bunch of files.

* frontends/pullnews and frontends/nntpget should be merged into one
  script.

* backends/filechan is just a simple version of backends/buffchan. It
  looks like filechan could just be deleted and innupgrade taught to change
  filechan feeds to buffchan -u feeds. map.c, which is only used by those
  two programs, could just be included in the same source file.

* actsyncd could stand a rewrite and cleaner handling of both
  configuration and syncing against multiple sources which are canonical
  for different sets of groups. In the process, FTP handling should
  support debugging.

* send-nntp and nntpsend basically do the same thing; send-nntp could
  probably be removed (possibly with some extra support in nntpsend for
  doing simpler things).


Long-Term Projects

* Look at turning header parsing into a library of some sort. Lots of INN
  does this, but different parts of INN need subtly different things, so
  the best API is unclear.

* INN's header handling needs to be checked against RFC 5536 and 5537.
  This may want to wait until after we have a header parsing library.

* The innd filter should be able to specify additional or replacement
  groups into which an article should be filed, or even spool the article
  to a local disk file rather than storing it. (See the stuff that the
  nnrpd filter can already do.)

* When articles expire out of a storage method with self-expire
  functionality, the overview and history entries for those articles
  should also be expired immediately. Otherwise, things like the GROUP
  command don't give the correct results. This will likely require a
  callback that can be passed to CNFS that is called to do the overview
  and history cleanup for each article overwritten.

* Feed control, namely allowing your peers to set policy on what articles
  you feed them (not just newsgroups but max article size and perhaps even
  filter properties like "non-binary"). Every site does this a bit
  differently. Some people have web interfaces, some people use GUP, some
  people roll their own alternate things. It would really be nice to have
  some good way of doing this as part of INN. It's worth considering an
  NNTP extension for this purpose, although the first step is to build a
  generic interface that an NNTP extension, a web page, etc. could all
  use. (An alternate way of doing this would be to extend IHAVE to pass
  the list of newsgroups as part of the command, although this doesn't
  seem as generally useful.)

* Traffic classification as an extension of filtering. The filter should
  be able to label traffic as binary (e.g.) without rejecting it, and
  newsfeeds should be extended to allow feeding only non-binary articles
  (e.g.) to a peer.

* External authenticators should also be able to do things like return a
  list of groups that a person is allowed to read or post to. Currently,
  maintaining a set of users and a set of groups, each of which some
  subset of the users is allowed to access, is far too difficult. For a
  good starting list of additional functionality that should be made
  available, look at everything the Perl authentication hooks can do.

* Allow nnrpd to spawn long-running helper processes.
  Not only would this be useful for handling authentication (so that the
  auth hooks could work without execing a program on every connection),
  but it may allow for other architectures for handling requests (such as
  a pool of helpers that deal only with overview requests). More than
  that, nnrpd should *be* a long-running helper process that innd can feed
  open file descriptors to. [Aidan Culley has ideas along these lines.]

* The tradspool storage method requires assigning a number to every
  newsgroup (for use in a token). Currently this is maintained in a
  separate tradspool.map file, but it would be much better to keep that
  information in the active file where it can't drop out of sync. A code
  assigned to each newsgroup would be useful for other things as well,
  such as hashing the directories for the tradindexed overview. For use
  for that purpose, though, the active file would have to be extended to
  include removed groups, since they'd need to be kept in the active file
  to reserve their numbers until the last articles expired.

* The locking of the active file leaves something to be desired; in
  general, the locking in INN (for the active file, the history file,
  spool updates, overview updates, and the like) needs a thorough
  inspection and some cleanup. A good place to start would be to trace
  through the pause and throttle code and write up a clear description of
  what gets locked where and what is safely restarted and what isn't.
  Long term, there needs to be a library locking routine that is used by
  *everything* that needs to write to the history file, active file, etc.,
  that keeps track of the PID of the process locking things, and that is
  accessible via ctlinnd.

* There is a fundamental problem with the current design of the
  control.ctl file.
  It combines two things: a database of hierarchies, their maintainers,
  and related information, and a list of which hierarchies the local
  server should honor. These should be separated out into the database
  (which could mostly be updated from a remote source like ftp.isc.org and
  then combined with local additions) and a configured list of hierarchies
  (or sub-hierarchies within hierarchies) that control messages should be
  honored for. This should be reasonably simple, although correct handling
  of checkgroups could get a mite tricky.

* Possible NNTP extension: Compression of the protocol, using gzip,
  bzip2, or some other technique. Particularly useful for long lists like
  the active file information or the overview information, but possibly
  useful in general for other things.

* Install wizards. Configuring INN is currently very complex even for an
  experienced news admin, and there are several fairly standard
  configurations that shouldn't be nearly that complicated to get running
  out of the box. A little interactive Perl script asking some simple
  questions could probably get a lot of cases right easily.

* One ideally wants to be able to easily convert between different
  overview formats or storage methods, refiling articles in place. This
  should be possible once we have a history API that allows changing the
  storage location of an article in-place.

* Set up the infrastructure required so that INN can use alloca. This
  would significantly decrease the number of calls to malloc needed and
  would be a lot more convenient. alloca is now available, but most
  programs still need to call alloca_free in their main loops before we
  can use it in the various libraries.

* Support building in a directory separate from the source tree.
  It may be best to just support this via lndir rather than try to do it
  in configure, but it would be ideal to add support for this to the
  autoconf system. Unfortunately, the standard method requires letting
  configure generate all of the makefiles, which would make running
  configure and config.status take much longer than it does currently.

* Look at adding some kind of support for MODE CANCEL via network sockets
  and fixing up the protocol so that it could possibly be standardized
  (the easiest thing to do would probably be to change it into a CANCEL
  command). If we want to get to the point where INN can accept and even
  propagate such feeds from dedicated spam filters or the like, there must
  also be some mechanism of negotiating policy in order to decide what
  cancels the server wants to be fed.

* The "possibly signed" char data type is one of the inherent flaws of C.
  Some other projects have successfully gotten completely away from this
  by declaring all of their strings to be unsigned char, defining a macro
  like U that casts strings to unsigned char for use with literal strings,
  and always using unsigned char everywhere. Unfortunately, this also
  requires wrapping all of the standard libc string functions, since
  they're prototyped as taking char rather than unsigned char. The
  benefits include cleaner and consistent handling of characters over 127,
  better warnings from the compiler, consistent behavior across platforms
  with different notions about the signedness of char, and the elimination
  of warnings from the <ctype.h> macros on platforms like Solaris where
  those macros can't handle signed characters. We should look at doing
  this for INN.

* It would clean up a lot of code considerably if we could just use mmap
  semantics regardless of whether the system has mmap.
  It may be possible to emulate mmap on systems that don't have it by
  reading the entirety of the file into memory and setting the flags that
  require things to call mmap_flush and mmap_invalidate on a regular
  basis, but it's not clear where to stash the file descriptor that
  corresponds to the mapped file.

* Consider replacing the awkward access: parameter in readers.conf with
  separate commands (e.g. "allownewnews: true") or otherwise cleaning up
  the interaction between access: and read:/post:. Note that at least
  allownewnews: can be treated as a setting for overriding inn.conf and
  should be very easy to add.

* Add a localport: parameter (similar to localaddress:) to readers.conf
  auth groups. With those two parameters (and ssl_required:) we would
  essentially eliminate the need to run multiple instances of nnrpd just to
  use different configurations.

* Various things may break when trying to use data written while compiled
  with large file support using a server that wasn't so compiled (and vice
  versa). The main one is the history file, but tradindexed is also
  affected and buffindexed has been reported to have problems with this
  as well. Ideally, all of INN's data files should be as portable as
  possible.


Code Reorganization

* storage should be reserved just for article storage; the overview
  methods should be in a separate overview tree.

* The split between frontends and backends is highly non-intuitive. Some
  better organization scheme should be arrived at. Perhaps something
  related to incoming and outgoing, with programs like cnfsstat moved into
  the storage directory with the other storage-related code?

* Add a separate utils directory for things like convdate, shlock,
  shrinkfile, and the like. Some of the scripts may possibly want to go
  into that directory too.

* The lib directory possibly should be split so that it contains only code
  always compiled and part of INN, and the various replacements for
  possibly missing system routines are in a separate directory (such as
  replace). These should possibly be separate libraries; there are things
  that currently link against libinn that only need the portability
  pieces.

* The doc directory really should be broken down further by type of
  documentation or section or something; it's getting a bit unwieldy.

* Untabify and reformat all of the code according to a consistent coding
  style which would then be enforced for all future check-ins.