Omega 1.2.25 (2017-09-26):

documentation:

* Suggest DOCIDORDER=X for DONT_CARE.

portability:

* Fix GCC -Wimplicit-fallthrough warning.

Omega 1.2.24 (2016-09-16):

build system:

* Drop unused configure check for symbol visibility.

Omega 1.2.23 (2016-03-28):

documentation:

* Update links to Xapian website and trac to use https, which is now supported,
  thanks to James Aylett.

indexers:

* Fix HTML/XML entity decoding to be O(n) not O(n²) - processing HTML/XML with
  a lot of entities is now much faster.

templates:

* Remove unused country code to name maps. These were intended as examples,
  but they aren't very useful as such, and really just bloat the templates
  needlessly.

Omega 1.2.22 (2015-12-29):

documentation:

* Stop maintaining ChangeLog files. They make merging patches harder, and stop
  'git cherry-pick' from working as it should. The git repo history should be
  sufficient for complying with GPLv2 2(a).

* Clarify help text for omindex --mime-type option.

* docs/omegascript.rst:

  + Fix documentation of $last to say it's the MSet index *one beyond* the end
    of the current page. Reported by Andrew Chilton.

  + Clarify that $split and $substr work in bytes. Previously we said
    "characters" which could be taken as meaning they work with UTF-8
    characters.

  + Update documentation for $filters - it was missing these CGI parameters
    from the list of those serialised: COLLAPSE, DOCIDORDER, SORT, SORTREVERSE,
    SORTAFTER

  + Explicitly note user can use $setmap to create their own maps.

* docs/overview.rst:

  + SVG extraction is built-in too.

  + Expand paragraph about command `false`. Note the versions where explicit
    support was added, and that this will also work with any version on Unix,
    where `false` is a command.

  + Document `cdb_dir`.

* docs/cgiparams.rst: Document behaviour if xDB is not set.

* Change "characters" to "bytes" in a few places to clarify that we don't mean
  Unicode code points.

indexers:

* omindex:

  + Add '--title-size' option.

  + Handle .oft the same way as .msg - it's some sort of template email, and
    has essentially the same format.

omega:

* Make $querydescription ensure the match has been run, so that it includes
  filters.

* Avoid $allterms, $cgilist, $filterterms and $terms being O(n²) in the number
  of items in the returned list.

* If xFILTERS is not set, don't force the first page as that's unhelpful if
  someone fails to set it in their template.

* When environment variable SERVER_PROTOCOL is set to INCLUDED (as it is when
  we're being included in a page), we already suppress the HTTP headers, but
  now we suppress the blank line after the header too.

* Support option flag_cjk_ngram if built against xapian-core >= 1.2.22.

testsuite:

* Add test coverage for parsing of HTML entities.

build system:

* Fix error reporting if PCRE isn't installed. Fixes #693, reported by lhz7370.

portability:

* Avoid warning when building with glibc >= 2.21.

* Don't provide our own implementation of sleep() under __WIN32__ if there
  already is one - mingw provides one, and in some situations it seems to clash
  with ours. Reported to xapian-discuss by John Alveris.

* Stop trying to use O_STREAMING - the patch to implement it was never merged
  into the Linux kernel, and I can't find any evidence that other platforms
  implement it. The constant value O_STREAMING used now seems to be used for
  the part of O_SYNC which isn't covered by O_DSYNC, which seems likely to hurt
  performance if anything.

Omega 1.2.21 (2015-05-20):

documentation:

* docs/overview.rst: Document 'E' prefixed boolean terms for filtering by
  extension (see #668, reported by bramvdh).

* docs/encodings.rst: Add a document about character encoding, as suggested by
  James Aylett in #550.

indexers:

* omindex:

  + outlookmsg2html: Fix handling of message/rfc822 subparts.

omega:

* $prettyurl now decodes valid UTF-8 sequences, and some additional ASCII
  characters in the path part: []@!$&'()*+.;= (Fixes #550 and #644, reported by
  catkin and terencz.)

* $prettyurl now leaves the query and fragment parts of the URL alone and won't
  decode an escaped "/" (omindex doesn't create URLs with any of these, so we
  only risk breaking other URLs which have them).

* Drop compilation date and time from output when run from the command line -
  they prevent reproducible builds and the version number is sufficient
  information.

templates:

* templates/query: When listing matching terms, don't make the commas italic.

* templates/query: Eliminate blank line before <html>.

* templates/xml: Add XML declaration.

* templates/godmode: Specify charset utf-8 in the content-type.

build system:

* Link test programs with libtool's '-no-install' or '-no-fast-install', like
  we already do in xapian-core, which means that libtool doesn't need to
  generate shell script wrappers for them on most platforms.

portability:

* Add spaces between literal strings and macros which expand to literal strings
  for C++11 compatibility.

* Remove 'register' as it's deprecated and clang spits out warnings because of
  that. Any modern compiler likely just ignores it as an optimisation hint
  anyway.

Omega 1.2.20 (2015-03-04):

documentation:

* docs/cgiparams.rst: Improve wording of docs for SORT parameter.

* docs/omegascript.rst: Update documentation references to DATE1, DATE2, and
  DAYSMINUS which were renamed in 0.6.x and the compatibility aliases removed
  in 1.0.0.

indexers:

* omindex:

  + Ignore extensions .msi and .msp, which are Microsoft installer files, but
    which libmagic sometimes incorrectly identifies as application/msword.

  + Interpret a command of "false" in "--filter" as meaning to ignore files
    with that MIME type.

omega:

* Handle CGI parameter [=0 as [=1.

templates:

* templates/xml: Update handling of DATE1, DATE2 and DAYSMINUS which were
  renamed in 0.6.x and the compatibility aliases removed in 1.0.0.

build system:

* configure: Use pkg-config in preference to determine flags needed to
  compile and link with PCRE, as this will just work when cross-compiling
  (at least under MXE).

* configure: Define MINGW_HAS_SECURE_API under mingw to get _putenv_s()
  declared in stdlib.h.

* Enable automake option 'subdir-objects' to avoid warning from newer automake.

portability:

* Avoid doing link tests with libmagic in configure as they fail on mingw due
  to not automatically picking up libraries which libmagic itself depends on.

Omega 1.2.19 (2014-10-21):

documentation:

* docs/overview.rst: Note that pdftotext is part of poppler as well as xpdf.
  (Noted by Paul Wise)

Omega 1.2.18 (2014-06-22):

indexers:

* omindex:

  + Work around libmagic returning a MIME content-type of "Composite Document
    File V2 Document[...]" or "application/CDFV2-corrupt" by returning a more
    suitable filetype based on looking at the file's extension.

  + The starting URL wasn't previously URL encoded. In 1.3.2, this will be
    fixed by URL encoding it as we do for the rest of the path, for the 1.2
    branch we only URL encode it if it contains a character <= 31 or at least
    one of '#', '%', ':' or '?'. This avoids a one-off reindex of every
    document in the database in cases which work OK in practice.

  + When we skip a file because it exceeds the configured size limit, include
    that size limit in the message.

omega:

* Add support for setting the query expansion scheme to use.

portability:

* Don't compile in unixperm.cc - it isn't currently used, and it fails to build
  with mingw. (fixes #635, reported by Alexis Denis)

* Fix warning when built with GCC 4.7.2 using -Os.

* Remove unused inline function, fixing compiler warning.

Omega 1.2.17 (2014-01-29):

documentation:

* docs/overview.html: Add Abiword as an example use of --filter, based on patch
  from Frank J Bruzzaniti (fixes #383).

portability:

* Fix "no previous declaration" warning on platforms which don't have
  mkdtemp().

Omega 1.2.16 (2013-12-04):

indexers:

* omindex:

  + Fix off-by-one when finding documents to delete which would sometimes cause
    omindex to fail to delete documents from the database when they weren't
    refound during an index update.

  + Decode dates in xlsx files.

  + Ignore extensions 'adm', 'cur', and 'ico' by default.

  + Group-readable files which are owner-readable but not world-readable should
    still get a "readable by owner" term added. Reported by Emmanuel Garette.

build system:

* Compress source tarballs with xz instead of gzip.

* configure: Sync compiler warning flag machinery against xapian-core. The
  changes are special handling for clang, passing -fshow-column where
  supported, and handling for new warning flags in GCC 4.6 and 4.7.

Omega 1.2.15 (2013-04-16):

omega:

* Don't pointlessly link utf8convert.o into the omega CGI.

Omega 1.2.14 (2013-03-14):

indexers:

* omindex:

  + Correct "max" -> "min" when reserving space for shared strings in .xlsx
    files. This just means we now reserve a more appropriate amount of space
    to start with.

  + Ignore .com files by default.

Omega 1.2.13 (2013-01-09):

indexers:

* omindex:

  + Extracting text using external filters now works for filenames containing a
    newline character - previously the newline got lost during escaping for the
    shell.

  + Fix segfault when -F option without a ':' is passed.

  + Skip a file if we get a read error while calculating the MD5 checksum (used
    for duplicate detection) - previously we used a checksum of the file up to
    that point.

  + Avoid rereading SVG and Atom files when we calculate their MD5 checksums.

  + Improve --help output and man page, most notably:

    - Say explicitly that --sample-size accepts the same formats as --max-size.

    - Note default size limit on files to index is unlimited.

  + When generating a sample for a CSV file, limit the size we pre-allocate to
    the CSV file size if that's smaller than the requested sample size, in case
    the user sets that limit very high.

omega:

* Fix to decode %-encoded character at the end of the query string.

build system:

* INCLUDES is now deprecated in automake, so use AM_CPPFLAGS instead.

Omega 1.2.12 (2012-06-27):

No changes since 1.2.11 except to bump the version - this release was made to
fix an incorrect library version information update in xapian-core 1.2.11.

Omega 1.2.11 (2012-06-26):

indexers:

* Change HTML parser's handling of multiple <body> tags and of text outside of
  <body> to match the behaviour of modern web browsers. (ticket#599)

* omindex:

  + Add command line option to control the size of the document sample stored.
    Patch from Mihai Bivol.

  + Rework .xlsx parsing to substitute the shared strings into the positions
    they are used in, so that the sample actually matches what appears in the
    spreadsheet, and to index calculated cell contents.

  + Improve handling of headers and footers in OpenDocument documents.

  + pdftotext outputs a formfeed between each page, which messes up our "empty
    body" check, so trim any trailing formfeeds before this check.

build system:

* Don't explicitly link indirect shared library dependencies on FreeBSD,
  OpenBSD, and Solaris.

Omega 1.2.10 (2012-05-09):

indexers:

* Add support for CDATA to HTML/XML parser.

* omindex:

  + Add --max-size option, based on patch from ndaley in ticket#587.

  + Add support for atom feed files, patch from Mihai Bivol in ticket#595.

  + If the document with the highest existing docid before the run was updated,
    we were reporting it as "added", but now we correctly report it as
    "updated". (Backported from 1.3.0).

  + Catch and report std::exception explicitly, so failing to allocate memory
    is no longer reported as "Unknown exception". (Backported from 1.3.0).

* scriptindex:

portability:

* Fix to build with GCC 4.7 by adding cast to rlim_t to fix error about C++11
  compatibility (reported by Gaurav Arora).

Omega 1.2.9 (2012-03-08):

documentation:

* docs/overview.html:

  + Document that libmagic is used to determine the MIME type if the extension
    isn't known. Partly addresses ticket#569.

  + We now limit time as well as CPU and memory for external filters.

indexers:

* Our HTML parser now ignores sections bracketed by <!--UdmComment--> and
  <!--/UdmComment-->, like we already do for <!--htdig_noindex-->.

* omindex: Add more extensions to the default ignore list: bin dat db fon jar
  lnk pyc pyd pyo sqlite sqlite3 sqlite-journal tmp ttf

Omega 1.2.8 (2011-12-13):

documentation:

* scriptindex.cc: Add link to http://xapian.org/docs/omega/scriptindex.html to
  --help output (and so also to the man page which is generated from this).

* omegascript.html: Add note to discourage use of percentage scores.

indexers:

* omindex:

  + If we don't get any data from an external filter for 5 minutes, give up -
    it has probably ended up blocked indefinitely.

  + Improve --help output (and man page which is generated from it). Closes
    bug#572.

* scriptindex:

  + If no rules are found in the index script, report an error and give up -
    this is inevitably the result of a mistake, and adding empty documents to
    the database isn't helpful.

omega:

  + Add new $prettyurl{} command which undoes RFC3986 URL escaping which
    doesn't affect semantics in practice. Partly addresses ticket#550.

  + Replace URL decoder with new implementation which handles various corner
    cases better. Fixes bug#578.

  + If CGI parameter P has trailing spaces, we now remove them all rather than
    leaving one.

templates:

* templates/query: HTML escape topterms.

* templates/godmode: HTML escape the contents of document values.

* templates/query: Don't show the percentage score in the default template.

testsuite:

* Add new urlenctest unit test of URL encoding and decoding.

portability:

* configure: Sync changes from xapian-core: Don't pass -Wshadow for GCC < 4.1;
  don't pass -Wstrict-null-sentinel for GCC 4.0.x; only enable symbol
  visibility on platforms where it is supported.

packaging:

* xapian-omega.spec: Package outlookmsg2html helper.

Omega 1.2.7 (2011-08-10):

documentation:

* docs/termprefixes.html: Document how to map a user prefix to multiple term
  prefixes.

* docs/overview.html: Improve documentation of htdig_noindex.

omega:

* Improve $version output from "Xapian - xapian-omega 1.2.7" to "xapian-omega
  1.2.7".

packaging:

* xapian-omega.spec: We're ABI compatible within a release series so make
  dependency on xapian-core-libs >= rather than =.

Omega 1.2.6 (2011-06-12):

documentation:

* docs/omegascript.html: Correct the documentation of the colours used by
  $highlight{}.

* docs/overview.html: Add using unoconv as a more complex example of using
  --filter (ticket#324).

templates:

* templates/query:

  + Make search query input type=search.

  + Autofocus the search query input (using HTML autofocus attribute with
    JavaScript fallback for older browsers). (ticket#544)

portability:

* Fix a compiler warning.

Omega 1.2.5 (2011-04-04):

documentation:

* Add index page which links to all the other documentation pages.

* INSTALL: Copy new Multi-Arch section from xapian-core/INSTALL. Replace VPATH
  section with better equivalent from xapian-core/INSTALL.

* docs/omegascript.html: Minor improvements.

indexers:

* The HTML parser no longer uses an exception to signify it has finished in
  the normal case as exceptions are typically costly to handle. In tests,
  this made omindex ~0.23% faster when indexing a lot of HTML files.

* omindex:

  + Add --ignore-exclusions option, which will index HTML files despite meta
    robots tags, etc - omindex is often used in environments where such
    exclusions aren't relevant.

  + Fix to compile with older versions of libmagic which don't have
    MAGIC_MIME_TYPE (e.g. on Ubuntu hardy).

  + Tell xls2csv to separate fields with spaces rather than commas, and not to
    quote them. Fixes indexing of numeric fields, and means we don't need to
    use our CSV parser to get a sample.

  + Add whitespace between chunks of text extracted from Microsoft Office 2007
    formats to prevent words in adjacent chunks from being run together.

  + Encode reserved characters in URLs - links to files with names containing
    '#' and '?' now work.

  + Handle .xlr extension the same way as .xls (later Microsoft Works versions
    apparently produce such files which are really the same format).

  + Index filename extension with new standard prefix E.

  + Just report the mimetype as unknown instead of saying "unknown Office 2007
    MIME subtype".

  + Ignore *.css and *.js by default too.

  + Messages reporting skipping files are now more consistent and always report
    the filename.

  + New --empty-docs option to allow documents we extract no body text from to
    be indexed (existing behaviour), skipped, or reported and then indexed.

omega:

* Fix double Content-Type header in some error reporting situations (regression
  introduced in 1.2.4).

* Update $url's URL encoding to follow RFC3986.

* Allow QueryParser flags to be set from OmegaScript (ticket#418). The
  FLAG_SPELLING_CORRECTION flag can now be set using
  $opt{flag_spelling_correction,1} - the old $opt{spelling,true} way to
  enable this flag still works, but it is now deprecated.

templates:

* templates/emptydocs,templates/godmode,templates/opensearch,templates/query,
  templates/xml: Add missing escaping. Some of these instances may allow
  cross-site scripting, so upgrading your templates is recommended, especially
  if you have any sensitive cookies set on the domain Omega is running on.

* templates/xml:

  + Try $field{caption} (which is what omindex sets) before $field{title} when
    getting a value for the hit tag's title attribute - this is consistent with
    how the query template gets the title.

  + Add new 'type' attribute which gives $field{type}.

  + Add 'DBSize' attribute to <result> element.

  + Fix double escaping of matching terms. This is only likely to affect cases
    where a matching term contains '&'.

  + Remove support for undocumented HILITECLASS CGI variable. There's no
    evidence I can find using Google code search or web search that this has
    been used anywhere, and it's difficult to handle escaping it properly in
    the face of all the ways it could reasonably be used.

portability:

* Fix to compile on Microsoft Windows (ticket#350).

Omega 1.2.4 (2010-12-19):

documentation:

* Minor documentation improvements.

indexers:

* Some iconv implementations (such as that on Mac OS X) don't handle many of
  the commonly seen mis-punctuated charset names (e.g. UTF16, UTF_16). We now
  check for this if iconv fails, fix up the charset name, and retry.

* The built-in character encoding converter now handles spaces in charset
  names.

* Use O_NOATIME if available and either the file is owned by the current euid,
  or the current euid is 0 (i.e. we're running as root). This avoids updating
  the access time of files we index which saves time. Fixes ticket#222.

* Report get_description() for Xapian exceptions, which provides additional
  information above get_msg().

* Add boolean terms with add_boolean_term() so they get wdf of 0 and don't
  contribute to document length.

* omindex:

  + Escape wildcard patterns being passed to unzip - in the unlikely event that
    one of these matched files in or under the current directory, we might fail
    to extract all the files we wanted to.

  + Add explicit support for indexing CSV files (better samples than from
    using '-Mcsv:text/plain').

  + Add support for indexing .msg files from Microsoft Outlook (using the Perl
    module Email::Outlook::Message). (ticket#334)

  + Improve --help for --mime-type option.

  + Optionally use libmagic to detect MIME types for files for which we have no
    extension mapping, which allows us to handle files with a misleading
    extension, or no extension at all. (ticket#114)

  + Add new --filter option which allows the user to specify new filters
    provided they return UTF-8 text on stdout.

  + If a filter command isn't installed, previously we wouldn't try it again
    for the same file extension - now we won't try it again for the same
    mime-type.

  + Index the leafname of the file (without any extension) as extra keywords.

  + Extract author from HTML, OpenDocument, and PDF files. Index it with an A
    prefix, and add it as a field.

  + Add support for indexing text and metadata from SVG files.

  + Extract metadata from Microsoft Office 2007 file formats.

  + Index text in headers and footers for .odt and .docx files.

  + Use the CSV parser to generate a nicer sample for files of type
    application/vnd.ms-excel.

  + Add support for indexing Debian and RPM package files (ticket#493).

  + Make the memory limit for filter processes the size of physical memory,
    which is a little less arbitrary than 7/8 of this value (ticket#424).

  + Under --duplicate=ignore, fix so that old documents which aren't seen get
    deleted, which wasn't implemented before (to suppress this deletion, pass
    -p as well).

  + Rename the short option for --version from -v to -V for consistency with
    scriptindex and many other packages, and to free up -v as the short option
    for --verbose. For backward compatibility, "omindex -v" is handled
    specially and still reports the version.

  + Add --verbose option, and disable the less interesting output unless it is
    specified.

  + Deprecate "--preserve-nonduplicates" in favour of new long option
    "--no-delete" which does the same thing, but has a clearer name.

  + The deletion of documents pass at the end of indexing is now more
    efficient. We track how many documents in the database we haven't seen so
    we can stop once we've found them all (a particularly big improvement if
    there are no documents to delete), and we now use a PostingIterator over
    all documents which avoids needing to catch an exception for every gap in
    the used document ids.

  + Quietly ignore files with mimetype set to "ignore". The initial list of
    extensions set to ignore is: .a .dll .dylib .exe .lib .o .obj .so

  + Index file owner and read permissions, to allow finding documents with a
    particular owner, and so searches can be restricted to documents a user is
    able to read.

  + Add file size as a document value, so you can sort on it and filter by it.

* scriptindex:

  + Fix file descriptor leak if the LOADFILE action is used on something which
    isn't a file.

omega:

* Make sure we write out HTTP headers when reporting an error early on.

* Extend $field to take an optional DOCID argument, rather than always using
  the context from $hitlist.

* Add new $emptydocs command which returns a list of documents with doclength
  zero.

* Add support for size: range filtering. Currently the end points of the range
  have to be specified in bytes (e.g. size:102400..204800 for 100-200KB).

templates:

* templates/emptydocs: New template which lists documents with doclength zero.

build system:

* configure: Probe for any options needed to enable large file support.
  Handling files >= 2GB isn't especially useful, but more importantly this is
  needed to allow omindex to index files on filing systems with 64 bit inodes
  on some platforms (e.g. 32-bit Linux).

* Use -no-undefined on platforms which need it to dynamically link such as
  cygwin (need to do this taken from ticket#282).

portability:

* Fix to compile with Sun C++.

Omega 1.2.3 (2010-08-24):

documentation:

* docs/termprefixes.html: Update "flint and quartz" to "flint and chert" as
  quartz is no longer supported. Give exact term length limit for flint and
  chert.

packaging:

* xapian-omega.spec: Don't run autoreconf - it's no longer required.

Omega 1.2.2 (2010-06-27):

portability:

* Apply getopt portability fixes from xapian-core 1.2.0, fixing build failures
  on Mac OS X (and probably some other platforms with non-GNU getopt
  implementations). (ticket#469)

Omega 1.2.1 (2010-06-22):

This release includes all changes from 1.0.21 which are relevant.

Omega 1.2.0 (2010-04-28):

This release includes all changes from 1.0.20 which are relevant.

build system:

* configure: Tell libtool not to link in deplibs on platforms where we know
  they aren't needed.

* configure: On Linux, extract the library search path from ldconfig which
  gives us the default entries reliably.

Omega 1.1.5 (2010-04-15):

This release includes all changes from 1.0.19 which are relevant.

Omega 1.1.4 (2010-02-15):

This release includes all changes from 1.0.18 which are relevant.

omega:

* Use the optimised integer to string conversion routines from xapian-core.

Omega 1.1.3 (2009-11-18):

This release includes all changes from 1.0.15-1.0.17 which are relevant.

templates:

* templates/query: If JavaScript is available, convert $field{modtime} to a
  string on the client-side so that the timezone is correct. If JavaScript
  isn't available, fall back to the existing behaviour of using UTC.
  (ticket#314)

build system:

* configure: Default to looking for xapian-config-1.1 unless XAPIAN_CONFIG is
  specified.

Omega 1.1.2 (2009-07-23):

This release includes all changes from 1.0.14 which are relevant.

indexers:

* omindex:

  + Handle the "macroenabled" versions of MS Office 2007 files too
    (ticket#290).

  + Extract pptx notesSlides and comments, if present (ticket#290).

Omega 1.1.1 (2009-06-09):

This release includes all changes from 1.0.13 which are relevant.

indexers:

* omindex:

  + Check the last modification time of files before reindexing (ticket#342).

  + Add "--spelling" option to index spelling correction data.

* scriptindex:

  + Add new "spell" action for indexing spelling correction data (ticket#296).

omega:

* Add $suggestion and $opt{spelling} to provide access to spelling correction
  (ticket#296).

* Add $opt{weighting} to allow the weighting scheme and parameters to be
  specified (ticket#298).

* If SERVER_PROTOCOL in the environment is set to INCLUDED, then our output is
  being included in another page (e.g. using SSI) so suppress the output of any
  HTTP headers.

templates:

* templates/query: Offer any spelling correction QueryParser gives.

build system:

* configure: Sync warning flags used with GCC with xapian-core apart from
  -Woverloaded-virtual which fires for MyHtmlParser::parse_html(). That
  probably should be tidied up at some point, but not right now.

Omega 1.1.0 (2009-04-23):

indexers:

* scriptindex:

  + Make deprecated "index=nopos" an error.

omega:

* New OmegaScript command $transform{} which performs regular expression
  substitutions using the PCRE library (which is now required to build Omega).
  (ticket#231)

build system:

* The build system is now bootstrapped with newer versions of autoconf and
  libtool which should produce smaller files and speed up configure and
  make.

Omega 1.0.23 (2011-01-14):

indexers:

* omindex:

  + Escape wildcard patterns being passed to unzip - in the unlikely event that
    one of these matched files in or under the current directory, we might fail
    to extract all the files we wanted to when indexing document formats like
    OpenDocument which use a zip file container.

  + The parser for OpenDocument metadata wasn't initialising its "state" field.
    Often you'd be lucky and it would be initialised to zero, but this could
    have caused misparsing of metadata in some cases.

* scriptindex: Fix file descriptor leak if the LOADFILE action is used on
  something that isn't a file.

* If fstat() fails when trying to load a file, preserve the errno value from
  the fstat call to report to the user.

portability:

* configure: Probe for any options needed to enable large file support.
  Handling files >= 2GB isn't especially useful, but more importantly this is
  needed to allow omindex to index files on filing systems with 64 bit inodes
  on some platforms (e.g. 32-bit Linux).

* Add -no-undefined to AM_LDFLAGS on platforms which need it to dynamically
  link such as cygwin (need to do this taken from ticket#282).

Omega 1.0.22 (2010-10-03):

portability:

* Fix to compile with Sun C++.

Omega 1.0.21 (2010-05-18):

portability:

* Fix build failure in freemem.cc on Microsoft Windows.

Omega 1.0.20 (2010-04-27):

portability:

* Fix build failure on Mac OS X and possibly some other platforms (regression
  caused by fix for getopt-related warnings on Cygwin in 1.0.19).

Omega 1.0.19 (2010-04-15):

portability:

* Fix getopt-related warning on Cygwin.

Omega 1.0.18 (2010-02-14):

indexers:

* Make the default charset "utf-8" not "UTF-8" as we lower case explicitly
  specified character sets to compare to see if we need to reparse. Previously
  XML documents which explicitly specified their character set as UTF-8 would
  cause a needless restart of the parser.

* omindex:

  + Increase the wdf boost for the document title from 2 to 5, since 2 isn't
    really enough.

* scriptindex:

  + Don't abort with "Unknown Exception" if indexing is disallowed or we hit
    </body> for a document which had an overridden character set. Fixes
    ticket#410.

Omega 1.0.17 (2009-11-18):

indexers:

* omindex:

  + On Linux, change the memory limit on external filters to use _SC_PHYS_PAGES
    since _SC_AVPHYS_PAGES excludes pages used by the OS cache and so will
    often report a really low value. Fixes Debian bug#548987 and ticket#358.

  + Fix likely crash when reading output from external filter program if read()
    is interrupted by a signal.

  + Fix potential crash when indexing PostScript files (fixed by using delete[]
    (not delete) for array allocated by new[]).

testsuite:

* utf8converttest: Charset "8859_1" isn't understood by Solaris libiconv, and
  isn't a standard charset name, so just test it when using our built-in
  converter and GNU libc.

portability:

* Fix build failure on Mac OS X 10.6.

* Also check for socketpair() in -lxnet if it isn't found without, which
  enables resource limits on external filter programs called by omindex on
  Solaris, and possibly some other platforms. Fixes ticket#412.

Omega 1.0.16 (2009-09-10):

* omega: Fix cross-site scripting vulnerability in reporting of exceptions
  (CVE-2009-2947).

Omega 1.0.15 (2009-08-26):

general:

* omegascript.vim: The list of OmegaScript commands in the vim mode was rather
  out of date, and a few commands were misclassified. Fix both problems and
  avoid future recurrences by automatically generating those lists from the
  command list in query.cc.

documentation:

* omegascript.html: Document that $date uses UTC. (ticket#314)

templates:

* query: Link to "xapian.org" rather than "www.xapian.org".

* inc/toptermsjs: Use double-quotes rather than single quotes for parameter
  values on the <script> tag.

portability:

* omindex: Implement correct handling of paths when calling external filter
  programs on Microsoft Windows.

Omega 1.0.14 (2009-07-21):

indexers:

* omindex: Make sure that output is flushed after every message, not just after
  some of them.

portability:

* Avoid infinite loop in omindex and scriptindex when reading files under
  Cygwin with automatic end of line translation enabled. This same bug can
  also manifest on Unix platforms if the file is truncated by another process
  while being read.

Omega 1.0.13 (2009-05-23):

indexers:

* omindex:

  + If the filter program needed for a file format isn't installed, report this
    explicitly when skipping subsequent files with the extension instead of
    misleadingly reporting "Unknown extension".

  + Make -s actually work as a short-form for --stemmer (as documented by
    "omindex --help" and "man omindex").

  + Drop the copyright info from the output of --version as it's perennially
    out of date and we don't report it for any other Xapian programs.

* scriptindex:

  + Add new "valuenumeric" action to add a document value using
    Xapian::sortable_serialise() to allow numeric sorting (ticket#260).

build system:

* configure: Enable more GCC warnings - "-Wstrict-null-sentinel" for 4.0+,
  "-Wlogical-op -Wmissing-declarations" for 4.3+.

Omega 1.0.12 (2009-04-19):

omega:

* $log now retries a partial write, or one interrupted by a signal.

build system:

* configure: Fix iconv parameter type probe not to implicitly cast a string
  literal to char* - this is a warning under GCC currently, but the user could
  pass -Werror explicitly in CXXFLAGS, and this could be promoted to an error
  in future GCC versions, and may already be so for some other compilers.

* Overriding CXXFLAGS at make-time (e.g. "make CXXFLAGS=-Os") no longer
  overrides any flags required for building with Xapian.

* We now actually use the compiler warning flags which configure detects.

Omega 1.0.11 (2009-03-15):

documentation:

* cgiparams.html: Note the technique of using a stub database file to allow a
  default of searching over multiple databases.

indexers:

* omindex:

  + Add support for indexing Microsoft Office 2007 formats and XPS files
    (bug#290).

  + Fix the extraction of metadata from OpenDocument formats.

  + Fix "-l" which would previously always cause a segmentation fault if used
    ("--depth-limit" wasn't affected).

build system:

* configure: The output of g++ --version changed format (again) with GCC 4.3
  which meant configure got "g++" for the version. Instead use the (hopefully)
  more robust technique of using g++ -E to pull out __GNUC__ and
  __GNUC_MINOR__.

* configure: Turn on _FORTIFY_SOURCE where available (as we do in xapian-core).

portability:

* Fix to compile when RLIMIT_AS isn't available (as on NetBSD and OpenBSD).
  Instead use RLIMIT_VMEM or RLIMIT_DATA if either is available, else don't try
  to limit the memory the filter process can use.

Omega 1.0.10 (2008-12-23):

build system:

* This release now uses newer versions of the autotools (autoconf 2.62 ->
  2.63; automake 1.10.1 -> 1.10.2).
  The newer autoconf fixes a regression in autoconf 2.62 (and so Omega 1.0.7)
  with detecting the endian-ness of some platforms.

Omega 1.0.9 (2008-10-31):

documentation:

* docs/overview.html: Document HTML parsing a bit, including robots
  meta and htdig_noindex.

omega:

* omega: Catch std::exception and report what its what() method returns.

* omega: Remove undocumented and non-functional support for numeric sorting
  via CGI parameter SORT=#<slot> (SORT=<slot> works as before).

build system:

* configure: Sync warning flag handling changes from xapian-core to eliminate
  many warnings from GCC 4.3.

Omega 1.0.8 (2008-09-04):

documentation:

* Fix a few typos and improve wording in a few places.

indexers:

* omindex:

  + If the character encoding is specified using <meta http-equiv=...> in an
    HTML document then reparse the document if it isn't the encoding we're
    already using so that any preceding <title> is converted correctly
    (bug#292).

  + Convert text from meta tag parameters to UTF-8 (bug#293).

  + Handle <meta charset="..."> (new in HTML 5).

  + Fix bug in HTML tag parameter parsing which was probably just a small
    performance penalty in real world cases, but could perhaps result in
    parsing bogus extra parameters in carefully contrived situations.

portability:

* Add missing <signal.h>, noted on FreeBSD by Henrik Brix Andersen.

Omega 1.0.7 (2008-07-14):

documentation:

* omegascript.html,scriptindex.html: Fix empty titles.

indexers:

* omindex:

  + When indexing text files, handle UCS-2 and UTF-16 text files with a
    byte-order mark (BOM), and ignore any UTF-8 "byte-order" mark.

  + The built-in conversion code (used when iconv isn't available) now handles
    UCS-2/UTF-16 with and without a BOM, and also the explicit BE and LE forms.

omega:

* Overhaul the $highlight colour combinations since some were rather
  unreadable (Debian bug 484456).

build system:

* configure: Synchronise code for working out warning flags used for builds
  with that used for xapian-core, which in particular handles different
  output formats from "gcc --version".

portability:

* configure: Fix header checks to pre-include <sys/types.h> which Mac OS X
  needs for some other headers to work.

* configure: Fix probing for iconv to work better when iconv isn't found
  (previously this only worked on Mac OS X with fink).

* Fix compilation error on FreeBSD, introduced in 1.0.5.

* In omega, cast size to unsigned before division to avoid a warning about
  signed overflow.

packaging:

* xapian-omega.spec: Remove "www." from xapian.org and oligarchy.co.uk URLs.

Omega 1.0.6 (2008-03-17):

documentation:

* docs/omegascript.html: Improve formatting.

indexers:

* omindex:

  + Add support for DjVu files.

  + If we get an error trying to read a directory entry, report it to the user
    rather than ignoring it.

omega:

* New OmegaScript commands $addfilter, $lower, $upper.

portability:

* Check "defined HAVE_SYSMP" rather than just "HAVE_SYSMP". This doesn't
  change behaviour, but fixes a compile warning on platforms other than Linux
  and IRIX.

Omega 1.0.5 (2007-12-21):

documentation:

* Convert .txt docs to reStructuredText which we process to produce HTML.

* Add a note inviting suggestions for additional reliable filter programs.

* overview.html: omindex hasn't generated "W"-prefix terms since 0.9.7, so
  remove the documentation saying it does.

indexers:

* omindex:

  + If a file's extension isn't found in the mime_map and contains uppercase
    ASCII characters, check for the lower cased extension (so .PDF and .Pdf
    behave the same way as .pdf, unless you deliberately add different mappings
    for them).

  + '-f' is documented by --help as a short option for '--follow', but wasn't
    previously actually recognised.

  + Limit filter programs to 7/8 of free physical memory on platforms where we
    know how to determine this statistic (currently at least Linux, FreeBSD,
    IRIX, HP-UX; probably Solaris and a few others too). This helps to prevent
    runaway filters from causing a denial of service (bug#111).

  + Avoid rereading uncompressed AbiWord documents in order to calculate their
    MD5 checksums.

* scriptindex:

  + Now inserts a ':' between prefix and term, using the same criteria which
    Xapian::QueryParser does.

  + The 'BOOLEAN' action now ignores an empty input rather than adding just the
    prefix as a term.

  + The 'UNIQUE' action now issues a warning for empty input but otherwise
    ignores it.

portability:

* Add explicit includes of C headers needed to build with the latest snapshots
  of GCC 4.3.

Omega 1.0.4 (2007-10-30):

omega:

* If an OmegaScript template specifies the same field name as both a boolean
  and a probabilistic term prefix then previously the boolean setting would
  be ignored (e.g. $setmap{prefix,foo,A}$setmap{boolprefix,foo,H}). Now this
  generates an error. If you set prefixes in your templates, you may wish to
  check them over before upgrading.

Omega 1.0.3 (2007-09-28):

general:

* Distribution tarballs are now in the POSIX "ustar" format since it saves
  a few KB and we need to use it for xapian-core anyway.

documentation:

* Expand the output of 'mbox2omega --help' and refer the reader to it from
  docs/scriptindex.txt.

indexers:

* omindex:

  + Add support for indexing AbiWord documents and TeX DVI files.

  + Impose a 5 minute CPU time limit on filter programs to prevent problems if
    a filter program goes into an infinite loop on a malformed input. Partly
    addresses bug#111.

* scriptindex:

  + Fix line number tracking in dump files.

omega:

* Add $muldiv{A,B,C} which calculates int(A*B/C).

* Fix bug in decimal fraction in $size for files >= 1M in size.

templates:

* query:

  + Set HTML charset to utf-8 since that's what databases now are by default.

  + Restyle to use CSS to draw a "score bar" instead of using images.

  + Rework the layout of each hit.

  + Add popup hints on mouse-over for various items.

  + Tidy up some HTML gremlins.

Omega 1.0.2 (2007-07-05):

documentation:

* scriptindex.txt: Fix typo.

indexers:

* omindex:

  + If --url isn't passed, default to "/", but print a warning noting that this
    default has been used (at least for now).

  + Report files that aren't indexed because their extensions aren't
    recognised.

build system:

* Value of XAPIAN_CONFIG supplied to configure is now passed to distcheck,
  to ensure that it works with uninstalled copies of Xapian.

portability:

* Fix test programs to build with a development snapshot of GCC 4.3.

Omega 1.0.1 (2007-06-11):

documentation:

* overview.txt: As of 1.0.0, we no longer use pstotext for PostScript, but
  instead use ps2pdf followed by pdftotext (since this works for Unicode).

* scriptindex.txt: Document that you can delete a document by supplying a new
  document which only contains the unique term.

indexers:

* Fix bug in HTML parser - if the text between two tags consisted entirely of
  whitespace it would just be ignored which could run words together if
  the tags didn't produce implicit whitespace. This bug dates back to at least
  Omega 0.8.2.

* omindex: Under Linux (and probably some other platforms) struct dirent can
  tell us the type of a directory entry for some filing systems, so make use of
  this to avoid calling stat() (or lstat()) unnecessarily - when indexing
  /usr/share/doc on my Linux box, this saves about 14000 explicit calls to
  stat() (leaving about 7000).

omega:

* Fix handling of query parsing errors (broken by changes in 1.0.0).

packaging:

* The required automake version has been lowered to 1.8.3, so RPMs can now be
  built on RHEL 4 and SLES 9.

Omega 1.0.0 (2007-05-17):

general:

* Omega and the indexers now work in UTF-8. If iconv() is available, omindex
  will use it to convert documents from other formats, otherwise it has
  built-in support for UTF-8 and ISO-8859-1; omindex knows how to run the
  various external filter programs to generate UTF-8 output; scriptindex
  assumes input is already in UTF-8.

* Change the project name (used to name tarballs, and default installation
  paths) to "xapian-omega" since that's what the RPMs and Debian packages
  already use (there's a Rogue-like game called Omega).

documentation:

* docs/overview.txt: Document what each of the OmegaScript templates does.

* docs/quickstart.txt: Assorted minor improvements.

* docs/termprefixes.txt: Document new 'Z' prefix, and that the 'R' and 'W'
  prefixes are no longer used by Xapian.

* docs/cgiparams.txt: FMT isn't limited to just `a-z' - the actual restriction
  is that it may not contain `..'.

* docs/scriptindex.txt: Explicitly note that index=nopos is deprecated
  (scriptindex already emits a warning).

* NEWS: Add note that Omega < 0.8.0 NEWS entries are in the xapian-core NEWS
  file.

* TODO: Updated.

indexers:

* Updated to use the new Xapian::TermGenerator class. This means that the
  indexing strategy has changed.

* "--help" now reports the default stemming language (i.e. "english").

* Implement new sample generating function which normalises all runs of
  whitespace to a single space, and fixes invalid UTF-8 in the sample.

* omindex:

  + We now index PostScript by converting to PDF with ps2pdf and then indexing
    that. This allows us to index PostScript files containing Unicode
    characters outside of ISO-8859-1, and also means we now get metadata from
    PostScript files. The downside is it is quite a bit slower.

  + Add support for indexing MS Works documents using wps2text (part of
    libwps).

  + Don't index empty files.

* scriptindex:

  + Fix optimisation of "load truncate=N" to actually work!

  + The "truncate" action knows not to chop off a multibyte UTF-8 character.

  + Update short option list for scriptindex to match documented usage (-h, -V
    and -s were not working).

  + Remove -q and -u options - they no longer do anything and are only accepted
    for compatibility with really old versions (0.6.1 and earlier for -q; 0.7.5
    and earlier for -u).

omega:

* Add an alternative implementation of date range filtering which uses a
  MatchDecider.
  This allows everything that the existing implementation does, plus you can
  support sorting on a choice of dates (e.g. first published or last updated),
  and filtering works to a resolution of a minute rather than a day. Set CGI
  parameter DATEVALUE to enable this, and to specify the value to use. Since
  omindex now adds the last modified date as value 0, this will work with
  omindex.

* Enhance $substr{} to accept a negative length (meaning to count back from the
  end of the string).

* New CGI parameters to allow finer control of sorting and ranking - SORTAFTER
  and DOCIDORDER.

* The sorting options are now encoded in $filters so Omega can automatically
  reset to page 1 if they are changed.

* Add new OmegaScript $weight command which returns the raw document weight -
  mostly useful for debugging purposes.

* $topterms{} now generates unstemmed terms.

* $prettyterm{TERM} has been updated to fit with changes to the term generation
  strategy.

* Add 'you' and 'your' as stopwords.

* $filesize{SIZE} enhanced to return a decimal point for K, M, and G (e.g.
  "2.1K" and "4.0M" rather than "2K" and "4M"); $filesize{0} is now "0 bytes";
  $filesize{1} is now "1 byte"; $filesize{SIZE} where SIZE is negative is now
  "".

* Remove $freqs as it has been deprecated for ages.

* Remove support for xB, xDATE1, xDATE2, xDAYSMINUS, and xDEFAULTOP which were
  deprecated in favour of xFILTER in 0.7.5 (over 3 years ago).

* Remove deprecated aliases for CGI parameters (deprecated in 0.6.3 or 0.6.5,
  more than 3.5 years ago): RAW_SEARCH (now RAWSEARCH), DATE1 (now START),
  DATE2 (now END), DAYSMINUS (now SPAN but with slightly different semantics),
  and MIN_HITS (now MINHITS).

* Remove "bias_weight" and "bias_halflife" CGI parameters since they rely on
  Enquire::set_bias() which has been removed.
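
Illustration for the $filesize{SIZE} entry above: a rough sketch of formatting
which gives the results listed there ("0 bytes", "1 byte", "2.1K", "4.0M",
and "" for a negative size). The function name pretty_size() is hypothetical,
and treating K, M and G as powers of 1024 is an assumption; this is not
Omega's actual code and its rounding may differ.

```cpp
#include <cstdio>
#include <string>

// Hypothetical sketch of $filesize-style formatting (not Omega's actual code).
// Assumption: K, M and G are powers of 1024.
std::string pretty_size(long long size) {
    if (size < 0) return std::string();  // Negative sizes give "".
    if (size == 1) return "1 byte";
    if (size < 1024) return std::to_string(size) + " bytes";
    const char units[] = {'K', 'M', 'G'};
    double value = static_cast<double>(size) / 1024.0;
    int unit = 0;
    // Move up a unit while the value still needs four or more digits.
    while (value >= 1024.0 && unit < 2) {
        value /= 1024.0;
        ++unit;
    }
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%.1f%c", value, units[unit]);
    return buf;
}
```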

templates:

* The 'query' template no longer uses $topterms by default.

* New 'topterms' template provides a query template with $topterms support.

* Template fragments which aren't intended for direct use have been moved to
  an "inc" subdirectory.

testsuite:

* md5test: Add tests for MD5 code.

build system:

* `./configure --enable-quiet' already allows you to specify at configure time
  to pass `--quiet' to libtool. Now you can override this at make-time by
  using `make QUIET=' (to turn off `--quiet') or `make QUIET=y' (to turn on
  `--quiet').

* configure: Disable probes for f77, gcj, and rc completely by preventing
  the probe code from even appearing in configure - this reduces the size of
  configure by 29% and should speed it up significantly.

portability:

* Fixed to build with GCC 4.3 snapshot.

* We now make use of the safe*.h portability headers from xapian-core.

* Ensure that the result of snprintf is zero terminated since MSVC's snprintf
  is broken (by design it seems).

* configure: xapian-config --cxxflags now includes -ptused for SGI's C++
  compiler, so we don't need to probe for it here.

* configure: Perform a link test for posix_fadvise to fix misdetection on
  HP-UX.

Omega 0.9.10 (2007-03-04):

documentation:

* docs/omegascript.txt: Rewrite introductory paragraph. Note that
  whitespace is significant, and add explicit warning to $setmap.

* docs/termprefixes.txt: Expand section on boolean prefixes, showing
  how to generate them using scriptindex, and how to allow them to be
  selected in an HTML form.

indexers:

* omindex: Generate correct MD5 checksums on big-endian platforms.

omega:

* Fix $substr{} with negative start to actually work.

* Fix $substr{} to never cause a C++ exception.
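
Illustration for the two $substr{} fixes above: one way to support a negative
start without ever throwing is to clamp both arguments into range before
calling std::string::substr(). The function name clamped_substr() and the
exact clamping rules are assumptions for the sake of the example; this is not
Omega's actual code.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sketch of $substr{STRING,START,LENGTH}-style semantics (not
// Omega's actual code). Assumptions: a negative start counts back from the
// end, a negative length stops that many bytes before the end, and values
// out of range are clamped so that substr() can never throw.
std::string clamped_substr(const std::string& s, long start, long len) {
    const long size = static_cast<long>(s.size());
    if (start < 0) {
        start += size;             // Count back from the end.
        if (start < 0) start = 0;  // Further back than the start: clamp.
    } else if (start > size) {
        start = size;
    }
    const long avail = size - start;
    if (len < 0) {
        len += avail;              // Stop -len bytes before the end.
        if (len < 0) len = 0;
    } else if (len > avail) {
        len = avail;
    }
    return s.substr(static_cast<size_t>(start), static_cast<size_t>(len));
}
```

All offsets here are in bytes, not UTF-8 characters.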

packaging:

* omega.spec.in: Remove "." from the end of the Summary.

Omega 0.9.9 (2006-11-09):

documentation:

* Ship our custom INSTALL file rather than the generic one from autoconf which
  we've accidentally been shipping instead since 0.9.5.

indexers:

* scriptindex: The "date" action no longer modifies the value it operates on
  (it was never meant to!)

omega:

* Report an error if $setmap is called with an even number of parameters.
  An incorrect example in the documentation used to suggest this, so it's
  particularly useful to catch this case.

packaging:

* RPMs: Prevent binaries getting an rpath for /usr/lib64 on FC6.

Omega 0.9.8 (2006-11-02):

omega:

* $substr where the start is negative and longer than the string (e.g.
  $substr{abcd,-5,1}) wasn't working as intended.

build system:

* configure: Tell AC_CHECK_HEADERS to suppress its backward compatibility mode,
  so it only checks headers with the compiler. This speeds up configure a
  little, and is what we do elsewhere.

* configure: Warning flags for GCC weren't actually getting used. Fix this to
  work and use the same warning flags for GCC and Intel C++ as xapian-core does.
  Fix all the warnings this uncovered!

* omega,omindex,scriptindex: Remove some old unused code.

portability:

* Ensure that we always pass an unsigned char value to isupper(), toupper(),
  etc as they are undefined on other values (glibc makes them work for signed
  char values too, but this is an extension).

* configure: Pass magic options to SGI's C++ compiler to allow linking of
  templates to work.

* configure: IRIX doesn't allow stdint.h to be included from C++ so we need
  a smarter configure test than AC_CHECK_HEADERS.

* Fix warnings from SGI's C++ compiler.
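
Illustration for the isupper()/toupper() entry above: a minimal sketch of
wrappers which convert a plain char to unsigned char before calling the
<cctype> functions. The names safe_isupper(), safe_toupper() and
upper_ascii() are hypothetical and this is not Omega's actual code.

```cpp
#include <cctype>
#include <string>

// Hypothetical helpers (not Omega's actual code). The <cctype> functions take
// an int which must be representable as unsigned char (or be EOF), so a plain
// char with the top bit set has to be converted before the call.
inline bool safe_isupper(char ch) {
    return std::isupper(static_cast<unsigned char>(ch)) != 0;
}

inline char safe_toupper(char ch) {
    return static_cast<char>(std::toupper(static_cast<unsigned char>(ch)));
}

// Upper-case a byte string without undefined behaviour on bytes >= 0x80
// (such as those found in UTF-8 sequences).
std::string upper_ascii(std::string s) {
    for (char& ch : s) ch = safe_toupper(ch);
    return s;
}
```

In the default "C" locale only a-z are changed, so multibyte UTF-8 sequences
pass through untouched.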

Omega 0.9.7 (2006-10-10):

documentation:

* omegascript.txt: Note that (by design) an omegascript template can't
  contain an infinite loop.

* termprefixes.txt: "$setmap{title,S}" should be "$setmap{prefix,title,S}".

* Use the default paths to the database directories and the omega CGI binary in
  examples.

* README: Update reference to "CVS" to say "SVN".

indexers:

* Don't get confused by "a<b" in Javascript in a <script> tag. Fixes bug#91.

* Support htdig's "ignore this bit" comments.

* Don't generate terms with more than 3 trailing symbols ('-', '+', or '#').

* omindex:

  + Add the file last modified time as value #0.

  + Generate an MD5 checksum of each file indexed and store it in value #1
    to allow duplicates to be collapsed.

  + Store the file's last modified time in the document data as "modtime" so it
    shows up in search results (and tweak the query template so the display of
    this information looks nicer). Don't add "modtime" field if the timestamp
    is (time_t)-1.

  + Run pdfinfo once and pull out the fields we want using string operations,
    instead of running it twice filtered through sed.

  + Parse the XML from OpenDocument and OpenOffice using new subclasses of
    HtmlParser. Only extract meta.xml once.

  + Add "size" field to document data.

  + Run xls2csv on MS Excel files, run catppt on MS Powerpoint files, and also
    index MS Word templates (.dot) the same way as .doc files.

  + Don't generate 'W' terms since omega doesn't use them.

  + If a filter program isn't installed, then don't try it again for the same
    extension (not perfect but an improvement - previously we indexed an empty
    document!)

  + If popen() fails, treat it as a read error.

* scriptindex:

  + Add new "load" action to allow the contents of an external file to be
    loaded and parsed.

  + Fix check for whether a record has content in the case where the same field
    is processed more than once.

omega:

* Add $pack and $unpack OmegaScript commands to allow big endian binary values
  to be encoded and decoded (for use with omindex's lastmod in value #0).

* omega.conf: Fix code which reads omega.conf to be line based as documented
  rather than the wacky whitespace based scheme that was actually implemented.
  Also we now allow "#" comments and blank lines in omega.conf.

* Fix $highlight{} to work with capitalised words (it used to work but
  regressed in 0.8.2).

* Use '\t' to separate terms in xP since filter terms might contain '.'. Fixes
  bug#87.

testsuite:

* Add htmlparsetest which tests the MyHtmlParser class.

build system:

* Makefile.am: Make use of the dist_ prefix to avoid having to list files in
  EXTRA_DIST as well as in *_SCRIPTS, *_DATA, and man_MANS.

* Makefile.am: Prefer $(sysconfdir) to @sysconfdir@ since the former can be
  overridden on the "make" command line.

portability:

* xapian-config will now switch Sun's C++ compiler into ANSI C++ compliant
  mode, so remove all the special case bits of code added for just this one
  compiler.

* omindex: Fix escaping of filenames to cast characters to "unsigned char" so
  that isalnum() works correctly everywhere. Not a security hole as dangerous
  characters were still being escaped.

* Call pclose() not fclose() on a FILE* obtained from popen(). This bug could
  cause us to run out of file descriptors on some platforms.

* configure: Check for strftime.

packaging:

* omega.spec.in: Include documentation in the RPM package.
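
Illustration for the $pack and $unpack entry above: a minimal sketch of
encoding a number as big endian bytes and decoding it again. The helper names
pack_be32() and unpack_be32() are hypothetical, and the fixed width of 4 bytes
is an assumption (the entry only says "big endian binary values"); this is not
Omega's actual code.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not Omega's actual code): encode an unsigned 32 bit
// value as 4 bytes, most significant byte first.
std::string pack_be32(std::uint32_t v) {
    std::string out(4, '\0');
    out[0] = static_cast<char>((v >> 24) & 0xff);
    out[1] = static_cast<char>((v >> 16) & 0xff);
    out[2] = static_cast<char>((v >> 8) & 0xff);
    out[3] = static_cast<char>(v & 0xff);
    return out;
}

// Decode 4 big endian bytes. Returning 0 for input of any other length is an
// arbitrary choice made for this sketch.
std::uint32_t unpack_be32(const std::string& s) {
    if (s.size() != 4) return 0;
    std::uint32_t v = 0;
    for (unsigned char ch : s) v = (v << 8) | ch;
    return v;
}
```

A useful property of this encoding is that comparing the packed strings byte
by byte gives the same order as comparing the numbers, which is what makes it
convenient for values such as timestamps.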

Omega 0.9.6 (2006-05-15):

documentation:

* docs/omegascript.txt: Clarified description of $now.

indexers:

* scriptindex: Fix "index" and "indexnopos" without a prefix to set the weight
  correctly (bug introduced in 0.9.5).

omega:

* Added new OmegaScript commands $filterterms and $substr.

portability:

* configure: Update snprintf detection to match xapian-core.

* Fix MSVC warnings.

packaging:

* omega.spec.in: Create and package /var/lib/omega/cdb and /var/log/omega.

Omega 0.9.5 (2006-04-08):

documentation:

* README: Add pointer to documentation.

* Added man pages for omindex and scriptindex, generated using help2man.

indexers:

* scriptindex:

  + If we fail to open the index script, die with an error (previously we
    acted as if an empty file was specified).

  + Warn about a useless "weight" action, even if it's followed by another
    non-useless action (e.g. "field") - previously we only warned if it
    was last or followed only by other useless actions.

  + Warn if "unique=<prefix>" is used without a corresponding
    "boolean=<prefix>" on the same line.

  + Warn that "index=nopos" is deprecated and should be replaced by
    "indexnopos".

  + Add explanatory text "(note that actions are executed from left to right)"
    when reporting useless actions.

  + Added new "hash" command to allow hashed terms to be generated from long
    URLs like omindex does.

* htdig2omega.script,mbox2omega.script: Make use of the new scriptindex "hash"
  command.

* dbi2omega: Check DBIDRIVER environmental variable to allow a driver other
  than mysql to be specified without modifying the script.

omega:

* Fix $opt[fieldnames] handling.
  Previously it would try to kick in if you didn't set fieldnames but set any
  alphabetically later option! The symptom was that $field{} would stop
  working (bug#72).

portability:

* omindex,omega: Tweaks for MSVC compilation.

Omega 0.9.4 (2006-02-21):

documentation:

* COPYING: Updated FSF address.

Omega 0.9.3 (2006-02-16):

documentation:

* overview.txt: The U prefix (URL term) was grouped with the date searching
  prefixes, but it makes more sense to group it with the prefixes relating to
  parts of the URL (H for hostname, P for path, etc).

* overview.txt: Add pointer to documentation of the supported query syntax.

* omegascript.txt: Improve descriptions of $cgi, $collapsed, $value, $version.

* termprefixes.txt: Fix typo.

indexers:

* omindex: Add --preserve-nonduplicates / -p option to not delete any documents
  that aren't updated, in replace duplicates mode (so that multiple runs of
  omindex on different subsites don't stomp on each other).

* omindex,scriptindex: Add "--stemmer" option to omindex and scriptindex
  to allow the stemming language to be set. Fixes bug#11.

* omindex,scriptindex: More consistent --help and --version output.

* omindex: Add support for OpenDocument format mimetypes and extensions out of
  the box. Previously you could index them but had to pass a "-m" option for
  each OpenDocument filename extension you wanted to handle.

* scriptindex: The "-q" option no longer actually controls anything. Just
  ignore it for backwards compatibility (and don't document it in --help).

omega:

* If executing an OmegaScript command causes a Xapian exception to be thrown,
  catch it and copy the error message into error_msg (which is read by the
  $error command). This allows such errors to be reported in a nicer way.

* Added "SORTREVERSE" CGI parameter which allows the sort order to be reversed
  when sorting on a value. Removed "SORTBANDS" CGI parameter since it no
  longer does anything.

* Added $find{LIST,STRING} to return the subscript of the first occurrence of
  string STRING in list LIST.

* Added $lookup{CDBFILE,KEY} OmegaScript command to perform a lookup in a CDB
  file.

* Added new feature which allows you to avoid storing fieldnames in every
  document. Instead you just store the field values, one per line, and add
  something like "$set{fieldnames,$split{caption sample url}}" to the
  OmegaScript template to specify the fieldnames to use. This can save a lot
  of disk space for a large database.

* Add new "$split{}" OmegaScript command which splits a string to give an
  OmegaScript list.

* Fix $url{} to escape "+" to "%2b". Also fix encoding of top-bit-set
  characters on platforms where char is signed by default.

* Speed up $highlight{} - only compare terms which are the same length.

* Reduce memory usage if a lot of documents are marked as relevant.

templates:

* query: Make the page title shorter so there's more chance it will fit on icon
  bars, etc.

* opensearch: Add missing escaping.

* godmode: If a non-existent docid is specified, report the error and prompt
  the user to enter another docid. Fixes bug#60.

portability:

* omega: Fix printf type mismatch on 64 bit platforms.

* omega: Cast time_t to unsigned long to avoid problems on 64bit platforms.

* Use snprintf where available.

* Write top-bit set characters using \xXX notation to avoid warnings from
  Intel's C++ compiler.
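
Illustration for the $url{} entry above: a rough sketch of percent-escaping
which handles both "+" and top-bit-set bytes. The function name url_escape()
and the set of characters left unescaped are assumptions; this is not Omega's
actual code. The point is that each byte goes through unsigned char, so it
formats as %80 to %ff whether or not plain char is signed.

```cpp
#include <cstdio>
#include <string>

// Hypothetical sketch of URL escaping (not Omega's actual code). Assumption:
// only ASCII letters, digits and "-_.~" are left as they are; everything
// else, including "+", becomes %XX with lower case hex digits.
std::string url_escape(const std::string& in) {
    std::string out;
    for (char c : in) {
        // Convert via unsigned char so bytes >= 0x80 give 0x80..0xff rather
        // than negative values on platforms where char is signed.
        unsigned char ch = static_cast<unsigned char>(c);
        bool safe = (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') ||
                    (ch >= '0' && ch <= '9') ||
                    ch == '-' || ch == '_' || ch == '.' || ch == '~';
        if (safe) {
            out += c;
        } else {
            char buf[4];
            std::snprintf(buf, sizeof(buf), "%%%02x",
                          static_cast<unsigned>(ch));
            out += buf;
        }
    }
    return out;
}
```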

Omega 0.9.2 (2005-07-15):

* omega: Changed $highlight so if OPEN and CLOSE aren't specified, they default
  to highlighting each word from the query with a different background colour
  like gmane does (previous default was to use '<strong>' and '</strong>').

* omega: Call QueryParser::set_database() as this is now used to decide what to
  do for terms like "C#".

* omega: Added the ability to set boolean prefixes for the QueryParser by
  setting a "boolprefix" map in the omegascript template.

* omega: Added $length{} and $stoplist{} commands to OmegaScript.

* scriptindex: Fix infinite loop if there's no newline at the end of a dumpfile.

* docs/termprefixes.txt: Explain how to use termprefixes with scriptindex and
  omega, since that's what most people will want to know.

* docs/omegascript.txt: Use standard "S" prefix for title in example for
  $setmap, rather than "XT".

Omega 0.9.1 (2005-06-06):

* Releases are now created using libtool 1.5.18 and automake 1.9.5.

* Updated RPM packaging.

Omega 0.9.0 (2005-05-13):

* Updated for 0.9.0 API changes.

* omindex/scriptindex: Generate terms like "c#".

* Added mbox2omega script which allows a mail folder to be indexed using
  scriptindex. Mostly it's an example as there's no mechanism included to show
  the full original message.

omega:

* The configuration file is now looked for differently - you can now set
  the environmental variable OMEGA_CONFIG_FILE. See docs/overview.txt for
  details.

* $highlight can now highlight terms like "C#".

* Add new template 'opensearch' to implement basic opensearch feeds of search
  results.

omindex:

* URL hashing previously depended on sizeof(long) so databases weren't totally
  portable between platforms.
  This is now fixed, but to do so we've had to
  break compatibility with databases built on platforms with 64 bit longs
  with URLs > 228 bytes.

* Removed useless "DUPE_duplicate" option.

* Added support for indexing Perl "pod" documentation using pod2text.

* Replaced -l/--no-recurse with -l/--depth-limit which takes an argument
  allowing recursion to be restricted to any depth, not just 0 or infinity!

* Extend -M/--mime-type to allow an existing mapping to be removed by omitting
  the type.

* Fixed code so that we get lstat() prototype on Linux systems where we have
  posix_fadvise().

scriptindex:

* Improved handling of extra blank lines in dump file.

* Strip multiple \r characters from end of line.

* Complain if a dump file doesn't appear to have been = escaped correctly.

* Flush database after each input file to ensure all changes from a file
  make it in.

documentation:

* docs/omegascript.txt: Clarify $field description slightly.

* docs/cgiparams.txt,docs/omegascript.txt: Fixed 3 references to OmXxxx classes.

* docs/termprefixes.txt: Added a single document covering all aspects of term
  prefixes.

* docs/omegascript.txt: Moved $collapsed into correct place alphabetically!

* docs/cgiparams.txt,docs/overview.txt: Improved description of how B filters
  are handled when building the query.

* docs/scriptindex.txt: Note that actions are applied in the specified order.

Omega 0.8.5 (2004-12-23):

* README,INSTALL: Proper installation instructions.

* omega: If an exception is thrown, make sure that the HTTP headers
  get written so that we don't cause "500 Internal Server Error".
  This problem was introduced by the change to allow a user specified
  Content-Type in 0.8.0. Partly addresses bug#60.

* scriptindex: Fixed "Unknown Exception" when trying to "unhtml" text which
  contains "</body>" (bug#61). This bug was introduced in 0.8.4.

* omindex/scriptindex: <h1> - <h6> and </h1> - </h6> now leave a space in the
  dumped HTML. This bug was introduced in 0.8.4 - before that any tag left
  a space in the dumped HTML.

* omindex: Only try to delete removed documents in "replace duplicates" mode
  (which is the default).

* omindex: Change behaviour of crawler such that it doesn't follow symbolic
  links any more. The new "--follow" command line option turns following of
  symlinks back on.

* dbi2omega: Add a comment to the start of the file detailing what
  dbi2omega does.

Omega 0.8.4 (2004-12-08):

* omindex,scriptindex: Improved HTML to text conversion - now we strip
  leading and trailing whitespace and convert all other consecutive groups of
  whitespace to a single space. Also the parser now knows that some tags
  should be regarded as word breaks and some shouldn't (previously all tags
  were treated as word breaks).

* omindex: Removed bogus extra line from code which was meant to
  truncate samples, titles, etc at a word boundary, but has never actually
  worked!

* omindex: Added hooks for indexing the following formats: OpenOffice (requires
  unzip), MS Word (requires antiword), Wordperfect (requires wpd2text), RTF
  (requires unrtf).

* omindex: If a filename to be passed to a filter program has a leading "-",
  protect it from possible interpretation as an option by prepending "./".

* omega: When there's only a boolean query we promote it to be the query.
  Tweaked so we use boolean weights in this case.

* omega: Use Query::empty() instead of the now deprecated Query::is_empty().

* omega,omindex,scriptindex: Use the new Database/WritableDatabase
  constructors.

* templates/godmode: Finished off godmode template.

* Compile everything as C++.

* Check snprintf actually works - some older versions don't implement C90
  snprintf semantics.

* XAPIAN_FLAGS already links with xapianqueryparser so remove
  -lxapianqueryparser from omega_LDADD as it was causing link errors on cygwin.

Omega 0.8.3 (2004-09-20):

* scriptindex: --version now actually reports the version. --help now exits
  with status 0 rather than status 1.

* RPM packaging: Updated. The most notable change is that the RPM is now
  called xapian-omega because there's already an omega RPM (in Fedora Core at
  least) which is a game. Also htdig2omega and htdig2omega.script are now
  included in the RPM.

* Install htdig2omega.script in ${prefix}/share/omega/ rather than
  ${prefix}/share/.

Omega 0.8.2 (2004-09-13):

* omega: $highlight now handles accented characters (bug#9).

* omega: Use new checkatleast parameter to Enquire::get_mset to implement
  MINHITS.

* omindex: When running with "replace duplicates" mode (the default), detect
  documents removed since the last indexing run and delete them from the
  database (bug #34).

* omindex: Use the new WritableDatabase::replace_document(term, doc) method.

* scriptindex: Report index script file name and line number when
  reporting errors in it. Added warning for redundant actions,
  such as "truncate" as the last action in a rule.

* templates/query: Always report if the database is not found - previously we
  only did so if there was a query.

* templates/query: Fixed missing </center> tag which happened in certain cases.

* docs/omegascript.txt: Added note that $add{$hit,1} gives
  the "hit number".

* Now includes htdig2omega and htdig2omega.script which allow you to crawl
  remote websites with ht://dig, then build a searchable index of them with
  Xapian and Omega.

* Link with -lxapianqueryparser, not -lomqueryparser.

Omega 0.8.1 (2004-06-30):

* omindex: Renamed hash() to hash_string() to avoid colliding with something
  on IRIX.

* omega: Changed MORELIKE to pick up to 40 terms, rather than up to 6 (feedback
  on the mailing list suggests this gives much better results).

* scriptindex: Added explicit catch for std::bad_alloc.

Omega 0.8.0 (2004-04-19):

* scriptindex: Change default to *not* overwriting the database (use
  --overwrite if you really want to do this); -u is now accepted but ignored.

* scriptindex: Use getopt for option parsing.

* omindex: Added --overwrite option which forces an existing database to be
  deleted before indexing begins.

* templates/xml: Correct spelling of `relavence' to `relevance'. NB: if you're
  parsing the XML output, you'll need to fix this spelling in your parser!

* templates/xml: Now set HTTP header: "Content-Type: application/html".

* templates/xml: Remove unused OmegaScript code:
  `$set{topterms,$or{$ne{$msize,0},$query}}'.

* indextext.cc,omindex.cc,scriptindex.cc: Updated to use add_term() instead of
  add_term_nopos().

* omega: Added $httpheader Omegascript to allow arbitrary HTTP headers and
  alternative Content-Type headers to be specified.

* omega: If the probabilistic query was bad, don't try to run the match.

* omega: Don't crash if there's a date filter but no probabilistic query.

* omindex/scriptindex: Raw terms with a multicharacter prefix are now indexed
  with a : inserted (e.g. as XFOO:Rterm). This matches what the query parser
  does.

* omindex/scriptindex: Don't create R terms for terms which start with a digit.

* omindex: Use O_STREAMING and/or posix_fadvise() when reading files to be
  indexed (if available). This helps to keep the Xapian database in cache,
  and should greatly improve indexing throughput.

* docs/scriptindex.txt: Make more explicit that boolean produces a *single*
  boolean term.

* docs/cgiparams.txt: Note that START and END should be in the format YYYYMMDD.

For NEWS entries for Omega versions prior to 0.8.0, see the xapian-core NEWS
file.