1Omega 1.2.25 (2017-09-26):
2
3documentation:
4
5* Suggest DOCIDORDER=X for DONT_CARE.
6
7portability:
8
9* Fix GCC -Wimplicit-fallthrough warning.
10
11Omega 1.2.24 (2016-09-16):
12
13build system:
14
15* Drop unused configure check for symbol visibility.
16
17Omega 1.2.23 (2016-03-28):
18
19documentation:
20
21* Update links to Xapian website and trac to use https, which is now supported,
22  thanks to James Aylett.
23
24indexers:
25
26* Fix HTML/XML entity decoding to be O(n) not O(n²) - processing HTML/XML with
27  a lot of entities is now much faster.
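
As an illustration of the linear-time idea only (a Python sketch, not Omega's
actual C++ parser): scanning the input once and appending decoded pieces to an
output buffer keeps the work proportional to the input length, whereas
replacing entities in place in one string can move the remaining tail each
time.  The name `decode_entities` and the tiny entity table are assumptions
made for this example.

```python
import re

# Hypothetical, deliberately tiny entity table for the example.
_NAMED = {"amp": "&", "lt": "<", "gt": ">", "quot": '"', "apos": "'"}
_ENTITY = re.compile(r"&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);")


def decode_entities(text):
    """Decode entities in one left-to-right pass, collecting output pieces."""
    pieces = []
    pos = 0
    for m in _ENTITY.finditer(text):
        # Copy the literal text between the previous entity and this one.
        pieces.append(text[pos:m.start()])
        body = m.group(1)
        if body[0] == "#":
            # Numeric character reference, hexadecimal or decimal.
            code = int(body[2:], 16) if body[1] in "xX" else int(body[1:])
            # Leave out-of-range references untouched rather than guess.
            pieces.append(chr(code) if 0 < code <= 0x10FFFF else m.group(0))
        else:
            # Unknown named entities are left exactly as written.
            pieces.append(_NAMED.get(body, m.group(0)))
        pos = m.end()
    pieces.append(text[pos:])
    return "".join(pieces)
```

Each input character is copied at most once, so the cost grows linearly with
the number of entities.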
28
29templates:
30
31* Remove unused country code to name maps.  These were intended as examples,
32  but they aren't very useful as such, and really just bloat the templates
33  needlessly.
34
35Omega 1.2.22 (2015-12-29):
36
37documentation:
38
39* Stop maintaining ChangeLog files.  They make merging patches harder, and stop
40  'git cherry-pick' from working as it should.  The git repo history should be
41  sufficient for complying with GPLv2 2(a).
42
43* Clarify help text for omindex --mime-type option.
44
45* docs/omegascript.rst:
46
47  + Fix documentation of $last to say it's the MSet index *one beyond* the end
48    of the current page.  Reported by Andrew Chilton.
49
50  + Clarify that $split and $substr work in bytes.  Previously we said
51    "characters" which could be taken as meaning they work with UTF-8
52    characters.
53
54  + Update documentation for $filters - it was missing these CGI parameters
55    from the list of those serialised: COLLAPSE, DOCIDORDER, SORT, SORTREVERSE,
56    SORTAFTER
57
58  + Explicitly note user can use $setmap to create their own maps.
59
60* docs/overview.rst:
61
62  + SVG extraction is built-in too.
63
64  + Expand paragraph about command `false`.  Note the versions where explicit
65    support was added, and that this will also work with any version on Unix,
66    where `false` is a command.
67
68  + Document `cdb_dir`.
69
70* docs/cgiparams.rst: Document behaviour if xDB is not set.
71
72* Change "characters" to "bytes" in a few places to clarify that we don't mean
73  Unicode code points.
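
To make the bytes-versus-characters distinction concrete, here is a small
Python illustration (Python stands in for OmegaScript, and `substr_bytes` is a
hypothetical helper, not an Omega command).  Slicing UTF-8 data by byte offset
can end in the middle of a multi-byte sequence, which is why the documentation
now says "bytes".

```python
def substr_bytes(text, start, length):
    """Return `length` bytes of the UTF-8 encoding of `text` from `start`."""
    return text.encode("utf-8")[start:start + length]


word = "na\u00efve"  # "naive" with a diaeresis: 5 characters, 6 UTF-8 bytes
char_count = len(word)
byte_count = len(word.encode("utf-8"))

# The first 3 bytes stop halfway through the two-byte sequence for U+00EF.
partial = substr_bytes(word, 0, 3)
```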
74
75indexers:
76
77* omindex:
78
79  + Add '--title-size' option.
80
81  + Handle .oft the same way as .msg - it's some sort of template email, and
82    has essentially the same format.
83
84omega:
85
86* Make $querydescription ensure the match has been run, so that it includes
87  filters.
88
89* Avoid $allterms, $cgilist, $filterterms and $terms being O(n²) in the number
90  of items in the returned list.
91
92* If xFILTERS is not set, don't force the first page as that's unhelpful if
93  someone fails to set it in their template.
94
95* When environment variable SERVER_PROTOCOL is set to INCLUDED (as it is when
96  we're being included in a page), we already suppress the HTTP headers, but
97  now we suppress the blank line after the header too.
98
99* Support option flag_cjk_ngram if built against xapian-core >= 1.2.22.
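
The SERVER_PROTOCOL behaviour above can be sketched as follows.  This is a
Python illustration under stated assumptions, not Omega's code: the function
name `render_response` and the single Content-Type header are hypothetical.

```python
import os


def render_response(body, environ=None):
    """Return the text to send, omitting headers when being included."""
    if environ is None:
        environ = os.environ
    if environ.get("SERVER_PROTOCOL") == "INCLUDED":
        # Included in another page: no headers and no blank separator line.
        return body
    # Hypothetical header for the example; the real headers may differ.
    return "Content-Type: text/html\r\n\r\n" + body
```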
100
101testsuite:
102
103* Add test coverage for parsing of HTML entities.
104
105build system:
106
107* Fix error reporting if PCRE isn't installed. Fixes #693, reported by lhz7370.
108
109portability:
110
111* Avoid warning when building with glibc >= 2.21.
112
113* Don't provide our own implementation of sleep() under __WIN32__ if there
114  already is one - mingw provides one, and in some situations it seems to clash
115  with ours.  Reported to xapian-discuss by John Alveris.
116
117* Stop trying to use O_STREAMING - the patch to implement it was never merged
118  into the Linux kernel, and I can't find any evidence that other platforms
  implement it.  The constant value that O_STREAMING used now seems to be used
  for the part of O_SYNC which isn't covered by O_DSYNC, which seems likely to
  hurt performance if anything.
122
123Omega 1.2.21 (2015-05-20):
124
125documentation:
126
127* docs/overview.rst: Document 'E' prefixed boolean terms for filtering by
128  extension (see #668, reported by bramvdh).
129
130* docs/encodings.rst: Add a document about character encoding, as suggested by
131  James Aylett in #550.
132
133indexers:
134
135* omindex:
136
137  + outlookmsg2html: Fix handling of message/rfc822 subparts.
138
139omega:
140
141* $prettyurl now decodes valid UTF-8 sequences, and some additional ASCII
142  characters in the path part: []@!$&'()*+.;= (Fixes #550 and #644, reported by
143  catkin and terencz.)
144
145* $prettyurl now leaves the query and fragment parts of the URL alone and won't
146  decode an escaped "/" (omindex doesn't create URLs with any of these, so we
147  only risk breaking other URLs which have them).
148
149* Drop compilation date and time from output when run from the command line -
150  they prevent reproducible builds and the version number is sufficient
151  information.
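
A rough Python sketch of the $prettyurl rules described above, under stated
assumptions: it treats everything before the first "?" or "#" as the path
part, uses the ASCII character list exactly as given in the entry, and the
names `pretty_url` and `_decode_run` are hypothetical.  It is not Omega's
implementation.

```python
import re

# ASCII characters the entry above lists as decoded in the path part.
_SAFE_ASCII = set(b"[]@!$&'()*+.;=")
_ESCAPE = re.compile(rb"%[0-9A-Fa-f]{2}")
_ESCAPE_RUN = re.compile(rb"(?:%[0-9A-Fa-f]{2})+")


def _decode_run(match):
    """Decode a run of consecutive %XX escapes where that is harmless."""
    escapes = _ESCAPE.findall(match.group(0))
    raw = bytes(int(e[1:], 16) for e in escapes)
    out = bytearray()
    i = 0
    while i < len(raw):
        byte = raw[i]
        if byte < 0x80:
            # ASCII: decode only the listed characters, so "%2F" stays as is.
            out += bytes([byte]) if byte in _SAFE_ASCII else escapes[i]
            i += 1
            continue
        # Expected length of a UTF-8 sequence starting with this lead byte.
        size = 2 if byte < 0xE0 else 3 if byte < 0xF0 else 4
        chunk = raw[i:i + size]
        try:
            chunk.decode("utf-8")
        except UnicodeDecodeError:
            out += escapes[i]  # not valid UTF-8: keep the original escape
            i += 1
        else:
            out += chunk
            i += size
    return bytes(out)


def pretty_url(url):
    """Return `url` with harmless escapes in the path part decoded."""
    data = url.encode("utf-8")
    # Everything from the first "?" or "#" onwards is left untouched.
    m = re.search(rb"[?#]", data)
    cut = m.start() if m else len(data)
    path = _ESCAPE_RUN.sub(_decode_run, data[:cut])
    return (path + data[cut:]).decode("utf-8")
```

Escapes which are left alone keep their original spelling, so a URL that
needs no prettifying comes back unchanged.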
152
153templates:
154
155* templates/query: When listing matching terms, don't make the commas italic.
156
157* templates/query: Eliminate blank line before <html>.
158
159* templates/xml: Add XML declaration.
160
161* templates/godmode: Specify charset utf-8 in the content-type.
162
163build system:
164
165* Link test programs with libtool's '-no-install' or '-no-fast-install', like
166  we already do in xapian-core, which means that libtool doesn't need to
167  generate shell script wrappers for them on most platforms.
168
169portability:
170
171* Add spaces between literal strings and macros which expand to literal strings
172  for C++11 compatibility.
173
174* Remove 'register' as it's deprecated and clang spits out warnings because of
175  that.  Any modern compiler likely just ignores it as an optimisation hint
176  anyway.
177
178Omega 1.2.20 (2015-03-04):
179
180documentation:
181
182* docs/cgiparams.rst: Improve wording of docs for SORT parameter.
183
184* docs/omegascript.rst: Update documentation references to DATE1, DATE2, and
185  DAYSMINUS which were renamed in 0.6.x and the compatibility aliases removed
186  in 1.0.0.
187
188indexers:
189
190* omindex:
191
192  + Ignore extensions .msi and .msp, which are Microsoft installer files, but
193    which libmagic sometimes incorrectly identifies as application/msword.
194
195  + Interpret a command of "false" in "--filter" as meaning to ignore files
196    with that MIME type.
197
198omega:
199
200* Handle CGI parameter [=0 as [=1.
201
202templates:
203
204* templates/xml: Update handling of DATE1, DATE2 and DAYSMINUS which were
205  renamed in 0.6.x and the compatibility aliases removed in 1.0.0.
206
207build system:
208
* configure: Prefer pkg-config for determining the flags needed to compile
  and link with PCRE, as this will just work when cross-compiling (at least
  under MXE).
212
213* configure: Define MINGW_HAS_SECURE_API under mingw to get _putenv_s()
214  declared in stdlib.h.
215
216* Enable automake option 'subdir-objects' to avoid warning from newer automake.
217
218portability:
219
220* Avoid doing link tests with libmagic in configure as they fail on mingw due
221  to not automatically picking up libraries which libmagic itself depends on.
222
223Omega 1.2.19 (2014-10-21):
224
225documentation:
226
227* docs/overview.rst: Note that pdftotext is part of poppler as well as xpdf.
228  (Noted by Paul Wise)
229
230Omega 1.2.18 (2014-06-22):
231
232indexers:
233
234* omindex:
235
236  + Work around libmagic returning a MIME content-type of "Composite Document
237    File V2 Document[...]" or "application/CDFV2-corrupt" by returning a more
238    suitable filetype based on looking at the file's extension.
239
  + The starting URL wasn't previously URL encoded.  In 1.3.2, this will be
    fixed by URL encoding it as we do for the rest of the path, but for the
    1.2 branch we only URL encode it if it contains a character <= 31 or at
    least one of '#', '%', ':' or '?'.  This avoids a one-off reindex of every
    document in the database in cases which work OK in practice.
245
246  + When we skip a file because it exceeds the configured size limit, include
247    that size limit in the message.
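
Taking the 1.2-branch rule for the starting URL literally, the test can be
sketched in Python as below.  The function name `start_url_needs_encoding` is
hypothetical, and the real check in omindex may differ in detail.

```python
def start_url_needs_encoding(start_url):
    """Return True if the start URL contains a character <= 31 or one of
    '#', '%', ':' or '?', per the rule described above."""
    return any(ord(ch) <= 31 or ch in "#%:?" for ch in start_url)
```

Start URLs for which this returns False are stored unchanged, so existing
databases built from them don't need a full reindex.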
248
249omega:
250
251* Add support for setting the query expansion scheme to use.
252
253portability:
254
255* Don't compile in unixperm.cc - it isn't currently used, and it fails to build
256  with mingw.  (fixes #635, reported by Alexis Denis)
257
258* Fix warning when built with GCC 4.7.2 using -Os.
259
260* Removed unused inline function, fixing compiler warning.
261
262Omega 1.2.17 (2014-01-29):
263
264documentation:
265
266* docs/overview.html: Add Abiword as an example use of --filter, based on patch
  from Frank J Bruzzaniti (fixes #383).
268
269portability:
270
271* Fix "no previous declaration" warning on platforms which don't have
272  mkdtemp().
273
274Omega 1.2.16 (2013-12-04):
275
276indexers:
277
278* omindex:
279
280  + Fix off-by-one when finding documents to delete which would sometimes cause
281    omindex to fail to delete documents from the database when they weren't
282    refound during an index update.
283
284  + Decode dates in xlsx files.
285
286  + Ignore extensions 'adm', 'cur', and 'ico' by default.
287
288  + Group-readable files which are owner-readable but not world-readable should
289    still get a "readable by owner" term added.  Reported by Emmanuel Garette.
290
291build system:
292
293* Compress source tarballs with xz instead of gzip.
294
295* configure: Sync compiler warning flag machinery against xapian-core.  The
296  changes are special handling for clang, passing -fshow-column where
297  supported, and handling for new warning flags in GCC 4.6 and 4.7.
298
299Omega 1.2.15 (2013-04-16):
300
301omega:
302
303* Don't pointlessly link utf8convert.o into the omega CGI.
304
305Omega 1.2.14 (2013-03-14):
306
307indexers:
308
309* omindex:
310
311  + Correct "max" -> "min" when reserving space for shared strings in .xlsx
312    files.  This just means we now reserve a more appropriate amount of space
313    to start with.
314
315  + Ignore .com files by default.
316
317Omega 1.2.13 (2013-01-09):
318
319indexers:
320
321* omindex:
322
323  + Extracting text using external filters now works for filenames containing a
324    newline character - previously the newline got lost during escaping for the
325    shell.
326
327  + Fix segfault when -F option without a ':' is passed.
328
329  + Skip a file if we get a read error while calculating the MD5 checksum (used
330    for duplicate detection) - previously we used a checksum of the file up to
331    that point.
332
333  + Avoid rereading SVG and Atom files when we calculate their MD5 checksums.
334
  + Improve --help output and man page, most notably:
336
337    - Say explicitly that --sample-size accepts the same formats as --max-size.
338
339    - Note default size limit on files to index is unlimited.
340
341  + When generating a sample for a CSV file, limit the size we pre-allocate to
342    the CSV file size if that's smaller than the requested sample size, in case
343    the user sets that limit very high.
344
345omega:
346
347* Fix to decode %-encoded character at the end of the query string.
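
The entry above doesn't describe the bug in detail.  One common way such a
problem arises is a bounds check which wrongly requires a character after the
two hex digits.  The Python sketch below shows a check that accepts an escape
occupying the final three bytes; `percent_decode` is a hypothetical name and
this is not Omega's decoder.

```python
_HEX = b"0123456789abcdefABCDEF"


def percent_decode(data):
    """Decode %XX escapes in bytes, including one that ends the input."""
    out = bytearray()
    i, n = 0, len(data)
    while i < n:
        # "i + 2 < n" means both hex digits exist; an escape in the last
        # three bytes (i == n - 3) passes this test.
        if (data[i:i + 1] == b"%" and i + 2 < n
                and all(c in _HEX for c in data[i + 1:i + 3])):
            out.append(int(data[i + 1:i + 3], 16))
            i += 3
        else:
            # Not a complete, valid escape: copy the byte through unchanged.
            out.append(data[i])
            i += 1
    return bytes(out)
```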
348
349build system:
350
351* INCLUDES is now deprecated in automake, so use AM_CPPFLAGS instead.
352
353Omega 1.2.12 (2012-06-27):
354
355No changes since 1.2.11 except to bump the version - this release was made to
356fix an incorrect library version information update in xapian-core 1.2.11.
357
358Omega 1.2.11 (2012-06-26):
359
360indexers:
361
362* Change HTML parser's handling of multiple <body> tags and of text outside of
363  <body> to match the behaviour of modern web browsers.  (ticket#599)
364
365* omindex:
366
367  + Add command line option to control the size of the document sample stored.
368    Patch from Mihai Bivol.
369
370  + Rework .xlsx parsing to substitute the shared strings into the positions
371    they are used in, so that the sample actually matches what appears in the
372    spreadsheet, and to index calculated cell contents.
373
374  + Improve handling of headers and footers in OpenDocument documents.
375
376  + pdftotext outputs a formfeed between each page, which messes up our "empty
377    body" check, so trim any trailing formfeeds before this check.
378
379build system:
380
381* Don't explicitly link indirect shared library dependencies on FreeBSD,
382  OpenBSD, and Solaris.
383
384Omega 1.2.10 (2012-05-09):
385
386indexers:
387
388* Add support for CDATA to HTML/XML parser.
389
390* omindex:
391
392  + Add --max-size option, based on patch from ndaley in ticket#587.
393
394  + Add support for atom feed files, patch from Mihai Bivol in ticket#595.
395
396  + If the document with the highest existing docid before the run was updated,
397    we were reporting it as "added", but now we correctly report it as
398    "updated".  (Backported from 1.3.0).
399
400  + Catch and report std::exception explicitly, so failing to allocate memory
401    is no longer reported as "Unknown exception".  (Backported from 1.3.0).
402
403* scriptindex:
404
405portability:
406
407* Fix to build with GCC 4.7 by adding cast to rlim_t to fix error about C++11
408  compatibility (reported by Gaurav Arora).
409
410Omega 1.2.9 (2012-03-08):
411
412documentation:
413
414* docs/overview.html:
415
416  + Document that libmagic is used to determine the MIME type if the extension
417    isn't known.  Partly addresses ticket#569.
418
419  + We now limit time as well as CPU and memory for external filters.
420
421indexers:
422
423* Our HTML parser now ignores sections bracketed by <!--UdmComment--> and
424  <!--/UdmComment-->, like we already do for <!--htdig_noindex-->.
425
426* omindex: Add more extensions to the default ignore list: bin dat db fon jar
427  lnk pyc pyd pyo sqlite sqlite3 sqlite-journal tmp ttf
428
429Omega 1.2.8 (2011-12-13):
430
431documentation:
432
433* scriptindex.cc: Add link to http://xapian.org/docs/omega/scriptindex.html to
434  --help output (and so also to the man page which is generated from this).
435
436* omegascript.html: Add note to discourage use of percentage scores.
437
438indexers:
439
440* omindex:
441
442  + If we don't get any data from an external filter for 5 minutes, give up -
443    it has probably ended up blocked indefinitely.
444
445  + Improve --help output (and man page which is generated from it).  Closes
446    bug#572.
447
448* scriptindex:
449
450  + If no rules are found in the index script, report an error and give up -
451    this is inevitably the result of a mistake, and adding empty documents to
452    the database isn't helpful.
453
454omega:
455
* Add new $prettyurl{} command which undoes RFC3986 URL escaping which doesn't
  affect semantics in practice.  Partly addresses ticket#550.

* Replace URL decoder with new implementation which handles various corner
  cases better.  Fixes bug#578.

* If CGI parameter P has trailing spaces, we now remove them all rather than
  leaving one.
464
465templates:
466
467* templates/query: HTML escape topterms.
468
469* templates/godmode: HTML escape the contents of document values.
470
471* templates/query: Don't show the percentage score in the default template.
472
473testsuite:
474
475* Add new urlenctest unit test of URL encoding and decoding.
476
477portability:
478
479* configure: Sync changes from xapian-core: Don't pass -Wshadow for GCC < 4.1;
480  don't pass -Wstrict-null-sentinel for GCC 4.0.x; only enable symbol
481  visibility on platforms where it is supported.
482
483packaging:
484
485* xapian-omega.spec: Package outlookmsg2html helper.
486
487Omega 1.2.7 (2011-08-10):
488
489documentation:
490
491* docs/termprefixes.html: Document how to map a user prefix to multiple term
492  prefixes.
493
494* docs/overview.html: Improve documentation of htdig_noindex.
495
496omega:
497
498* Improve $version output from "Xapian - xapian-omega 1.2.7" to "xapian-omega
499  1.2.7".
500
501packaging:
502
503* xapian-omega.spec: We're ABI compatible within a release series so make
504  dependency on xapian-core-libs >= rather than =.
505
506Omega 1.2.6 (2011-06-12):
507
508documentation:
509
510* docs/omegascript.html: Correct the documentation of the colours used by
511  $highlight{}.
512
513* docs/overview.html: Add using unoconv as more complex example of using
514  --filter (ticket#324).
515
516templates:
517
518* templates/query:
519
520  + Make search query input type=search.
521
522  + Autofocus the search query input (using HTML autofocus attribute with
523    Javascript fallback for older browsers).  (ticket#544)
524
525portability:
526
527* Fix a compiler warning.
528
529Omega 1.2.5 (2011-04-04):
530
531documentation:
532
533* Add index page which links to all the other documentation pages.
534
535* INSTALL: Copy new Multi-Arch section from xapian-core/INSTALL.  Replace VPATH
  section with better equivalent from xapian-core/INSTALL.
537
538* docs/omegascript.html: Minor improvements.
539
540indexers:
541
542* The HTML parser no longer uses an exception to signify it has finished in
543  the normal case as exceptions are typically costly to handle.  In tests,
544  this made omindex ~0.23% faster when indexing a lot of HTML files.
545
546* omindex:
547
548  + Add --ignore-exclusions option, which will index HTML files despite meta
549    robots tags, etc - omindex is often used in environments where such
550    exclusions aren't relevant.
551
552  + Fix to compile with older versions of libmagic which don't have
553    MAGIC_MIME_TYPE (e.g. on Ubuntu hardy).
554
555  + Tell xls2csv to separate fields with spaces rather than commas, and not to
556    quote them.  Fixes indexing of numeric fields, and means we don't need to
557    use our CSV parser to get a sample.
558
559  + Add whitespace between chunks of text extracted from Microsoft Office 2007
560    formats to prevent words in adjacent chunks from being run together.
561
562  + Encode reserved characters in URLs - links to files with names containing
563    '#' and '?' now work.
564
565  + Handle .xlr extension the same way as .xls (later Microsoft Works versions
566    apparently produce such files which are really the same format).
567
568  + Index filename extension with new standard prefix E.
569
570  + Just report the mimetype as unknown instead of saying "unknown Office 2007
571    MIME subtype".
572
573  + Ignore *.css and *.js by default too.
574
575  + Messages reporting skipping files are now more consistent and always report
576    the filename.
577
578  + New --empty-docs option to allow documents we extract no body text from to
579    be indexed (existing behaviour), skipped, or reported and then indexed.
580
581omega:
582
583* Fix double Content-Type header in some error reporting situations (regression
584  introduced in 1.2.4).
585
586* Update $url's URL encoding to follow RFC3986.
587
588* Allow QueryParser flags to be set from OmegaScript (ticket#418).  The
589  FLAG_SPELLING_CORRECTION flag can now be set using
590  $opt{flag_spelling_correction,1} - the old $opt{spelling,true} way to
  enable this flag still works, but is now deprecated.
592
593templates:
594
595* templates/emptydocs,templates/godmode,templates/opensearch,templates/query,
596  templates/xml: Add missing escaping.  Some of these instances may allow
597  cross-site scripting, so upgrading your templates is recommended, especially
598  if you have any sensitive cookies set on the domain Omega is running on.
599
600* templates/xml:
601
602  + Try $field{caption} (which is what omindex sets) before $field{title} when
603    getting a value for the hit tag's title attribute - this is consistent with
604    how the query template gets the title.
605
606  + Add new 'type' attribute which gives $field{type}.
607
608  + Add 'DBSize' attribute to <result> element.
609
610  + Fix double escaping of matching terms.  This is only likely to affect cases
611    where a matching term contains '&'.
612
613  + Remove support for undocumented HILITECLASS CGI variable.  There's no
614    evidence I can find using Google code search or web search that this has
615    been used anywhere, and it's difficult to handle escaping it properly in
616    the face of all the ways it could reasonably be used.
617
618portability:
619
620* Fix to compile on Microsoft Windows (ticket#350).
621
622Omega 1.2.4 (2010-12-19):
623
624documentation:
625
626* Minor documentation improvements.
627
628indexers:
629
630* Some iconv implementations (such as that on Mac OS X) don't handle many of
631  the commonly seen mis-punctuated charset names (e.g. UTF16, UTF_16).  We now
632  check for this if iconv fails, fix up the charset name, and retry.
633
634* The built-in character encoding converter now handles spaces in charset
635  names.
636
637* Use O_NOATIME if available and either the file is owned by the current euid,
638  or the current euid is 0 (i.e. we're running as root).  This avoids updating
639  the access time of files we index which saves time.  Fixes ticket#222.
640
641* Report get_description() for Xapian exceptions, which provides additional
642  information above get_msg().
643
644* Add boolean terms with add_boolean_term() so they get wdf of 0 and don't
645  contribute to document length.
646
647* omindex:
648
649  + Escape wildcard patterns being passed to unzip - in the unlikely event that
650    one of these matched files in or under the current directory, we might fail
651    to extract all the files we wanted to.
652
653  + Add explicit support for indexing CSV files (better samples than from
654    using '-Mcsv:text/plain').
655
656  + Add support for indexing .msg files from Microsoft Outlook (using the Perl
    module Email::Outlook::Message).  (ticket#334)
658
659  + Improve --help for --mime-type option.
660
661  + Optionally use libmagic to detect MIME types for files for which we have no
662    extension mapping, which allows us to handle files with a misleading
663    extension, or no extension at all.  (ticket#114)
664
665  + Add new --filter option which allows the user to specify new filters
666    provided they return UTF-8 text on stdout.
667
668  + If a filter command isn't installed, previously we wouldn't try it again
669    for the same file extension - now we won't try it again for the same
670    mime-type.
671
672  + Index the leafname of the file (without any extension) as extra keywords.
673
674  + Extract author from HTML, OpenDocument, and PDF files.  Index it with an A
675    prefix, and add it as a field.
676
677  + Add support for indexing text and metadata from SVG files.
678
679  + Extract metadata from Microsoft Office 2007 file formats.
680
681  + Index text in headers and footers for .odt and .docx files.
682
683  + Use the CSV parser to generate a nicer sample for files of type
684    application/vnd.ms-excel.
685
686  + Add support for indexing Debian and RPM package files (ticket#493).
687
688  + Make the memory limit for filter processes the size of physical memory,
689    which is a little less arbitrary than 7/8 of this value (ticket#424).
690
691  + Under --duplicate=ignore, fix so that old documents which aren't seen get
692    deleted, which wasn't implemented before (to suppress this deletion, pass
693    -p as well).
694
695  + Rename the short option for --version from -v to -V for consistency with
696    scriptindex and many other packages, and to free up -v as the short option
697    for --verbose.  For backward compatibility, "omindex -v" is handled
698    specially and still reports the version.
699
700  + Add --verbose option, and disable the less interesting output unless it is
701    specified.
702
703  + Deprecate "--preserve-nonduplicates" in favour of new long option
704    "--no-delete" which does the same thing, but has a clearer name.
705
706  + The deletion of documents pass at the end of indexing is now more
707    efficient.  We track how many documents in the database we haven't seen so
708    we can stop once we've found them all (a particularly big improvement if
709    there are no documents to delete), and we now use a PostingIterator over
710    all documents which avoids needing to catch an exception for every gap in
711    the used document ids.
712
713  + Quietly ignore files with mimetype set to "ignore".  The initial list of
714    extensions set to ignore is: .a .dll .dylib .exe .lib .o .obj .so
715
716  + Index file owner and read permissions, to allow finding documents with a
717    particular owner, and so searches can be restricted to documents a user is
718    able to read.
719
720  + Add file size as a document value, so you can sort on it and filter by it.
721
722* scriptindex:
723
724  + Fix file descriptor leak if the LOADFILE action is used on something which
725    isn't a file.
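
The early-stop idea in the omindex deletion pass entry above can be sketched
in Python as follows.  This is an illustration only: `find_unseen` is a
hypothetical name, plain Python collections stand in for the database and its
PostingIterator, and it assumes every seen docid is in the database.

```python
def find_unseen(all_docids, seen):
    """Return (docids to delete, number of docids examined).

    `all_docids` is an iterable of the docids in use, in order; `seen` is
    the set of docids encountered during this indexing run.
    """
    all_docids = list(all_docids)
    unseen_left = len(all_docids) - len(seen)
    to_delete = []
    examined = 0
    for docid in all_docids:
        if unseen_left == 0:
            break  # every unseen document has been found, so stop early
        examined += 1
        if docid not in seen:
            to_delete.append(docid)
            unseen_left -= 1
    return to_delete, examined
```

When nothing needs deleting the loop exits before examining any document,
which is the case the entry calls out as a particularly big improvement.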
726
727omega:
728
729* Make sure we write out HTTP headers when reporting an error early on.
730
731* Extend $field to take an optional DOCID argument, rather than always using
732  the context from $hitlist.
733
734* Add new $emptydocs command which returns a list of documents with doclength
735  zero.
736
737* Add support for size: range filtering.  Currently the end points of the range
738  have to be specified in bytes (e.g. size:102400..204800 for 100-200KB).
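
Since the end points must be given in bytes, a caller working in KB has to
convert first.  A minimal Python sketch of that arithmetic, assuming 1KB means
1024 bytes as in the example above (`size_filter` is a hypothetical helper,
not part of Omega):

```python
def size_filter(min_kb, max_kb):
    """Build a size: range from limits given in KB (1KB = 1024 bytes)."""
    return "size:%d..%d" % (min_kb * 1024, max_kb * 1024)
```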
739
740templates:
741
742* templates/emptydocs: New template which lists documents with doclength zero.
743
744build system:
745
746* configure: Probe for any options needed to enable large file support.
747  Handling files >= 2GB isn't especially useful, but more importantly this is
748  needed to allow omindex to index files on filing systems with 64 bit inodes
749  on some platforms (e.g. 32-bit Linux).
750
* Use -no-undefined on platforms which need it to dynamically link such as
  cygwin (the need for this was noted in ticket#282).
753
754portability:
755
756* Fix to compile with Sun C++.
757
758Omega 1.2.3 (2010-08-24):
759
760documentation:
761
762* docs/termprefixes.html: Update "flint and quartz" to "flint and chert" as
763  quartz is no longer supported.  Give exact term length limit for flint and
764  chert.
765
766packaging:
767
768* xapian-omega.spec: Don't run autoreconf - it's no longer required.
769
770Omega 1.2.2 (2010-06-27):
771
772portability:
773
774* Apply getopt portability fixes from xapian-core 1.2.0, fixing build failures
775  on Mac OS X (and probably some other platforms with non-GNU getopt
776  implementations). (ticket#469)
777
778Omega 1.2.1 (2010-06-22):
779
780This release includes all changes from 1.0.21 which are relevant.
781
782Omega 1.2.0 (2010-04-28):
783
784This release includes all changes from 1.0.20 which are relevant.
785
786build system:
787
788* configure: Tell libtool not to link in deplibs on platforms where we know
789  they aren't needed.
790
791* configure: On Linux, extract the library search path from ldconfig which
792  gives us the default entries reliably.
793
794Omega 1.1.5 (2010-04-15):
795
796This release includes all changes from 1.0.19 which are relevant.
797
798Omega 1.1.4 (2010-02-15):
799
800This release includes all changes from 1.0.18 which are relevant.
801
802omega:
803
804* Use the optimised integer to string conversion routines from xapian-core.
805
806Omega 1.1.3 (2009-11-18):
807
808This release includes all changes from 1.0.15-1.0.17 which are relevant.
809
810templates:
811
812* templates/query: If JavaScript is available, convert $field{modtime} to a
813  string on the client-side so that the timezone is correct.  If JavaScript
814  isn't available, fall back to the existing behaviour of using UTC.
815  (ticket#314)
816
817build system:
818
819* configure: Default to looking for xapian-config-1.1 unless XAPIAN_CONFIG is
820  specified.
821
822Omega 1.1.2 (2009-07-23):
823
824This release includes all changes from 1.0.14 which are relevant.
825
826indexers:
827
828* omindex:
829
830  + Handle the "macroenabled" versions of MS Office 2007 files too
831    (ticket#290).
832
833  + Extract pptx notesSlides and comments, if present.  (ticket#290).
834
835Omega 1.1.1 (2009-06-09):
836
837This release includes all changes from 1.0.13 which are relevant.
838
839indexers:
840
841* omindex:
842
843  + Check the last modification time of files before reindexing (ticket#342).
844
845  + Add "--spelling" option to index spelling correction data.
846
847* scriptindex:
848
849  + Add new "spell" action for indexing spelling correction data (ticket#296).
850
851omega:
852
853* Add $suggestion and $opt{spelling} to provide access to spelling correction
854  (ticket#296).
855
856* Add $opt{weighting} to allow the weighting scheme and parameters to be
857  specified (ticket#298).
858
859* If SERVER_PROTOCOL in the environment is set to INCLUDED, then our output is
860  being included in another page (e.g. using SSI) so suppress the output of any
861  HTTP headers.
862
863templates:
864
865* templates/query: Offer any spelling correction QueryParser gives.
866
867build system:
868
869* configure: Sync warning flags used with GCC with xapian-core apart from
870  -Woverloaded-virtual which fires for MyHtmlParser::parse_html().  That
871  probably should be tidied up at some point, but not right now.
872
873Omega 1.1.0 (2009-04-23):
874
875indexers:
876
877* scriptindex:
878
879  + Make deprecated "index=nopos" an error.
880
881omega:
882
883* New OmegaScript command $transform{} which performs regular expression
884  substitutions using the PCRE library (which is now required to build Omega).
885  (ticket#231)
886
887build system:
888
889* The build system is now bootstrapped with newer versions of autoconf and
890  libtool which should produce smaller files and speed up configure and
891  make.
892
893Omega 1.0.23 (2011-01-14):
894
895indexers:
896
897* omindex:
898
899  + Escape wildcard patterns being passed to unzip - in the unlikely event that
900    one of these matched files in or under the current directory, we might fail
901    to extract all the files we wanted to when indexing document formats like
902    OpenDocument which use a zip file container.
903
904  + The parser for OpenDocument metadata wasn't initialising its "state" field.
905    Often you'd be lucky and it would be initialised to zero, but this could
906    have caused misparsing of metadata in some cases.
907
908* scriptindex: Fix file descriptor leak if the LOADFILE action is used on
909  something that isn't a file.
910
911* If fstat() fails when trying to load a file, preserve the errno value from
912  the fstat call to report to the user.
913
914portability:
915
916* configure: Probe for any options needed to enable large file support.
917  Handling files >= 2GB isn't especially useful, but more importantly this is
918  needed to allow omindex to index files on filing systems with 64 bit inodes
919  on some platforms (e.g. 32-bit Linux).
920
* Add -no-undefined to AM_LDFLAGS on platforms which need it to dynamically
  link such as cygwin (the need for this was noted in ticket#282).
923
924Omega 1.0.22 (2010-10-03):
925
926portability:
927
928* Fix to compile with Sun C++.
929
930Omega 1.0.21 (2010-05-18):
931
932portability:
933
934* Fix build failure in freemem.cc on Microsoft Windows.
935
936Omega 1.0.20 (2010-04-27):
937
938portability:
939
940* Fix build failure on Mac OS X and possibly some other platforms (regression
941  caused by fix for getopt-related warnings on Cygwin in 1.0.19).
942
943Omega 1.0.19 (2010-04-15):
944
945portability:
946
947* Fix getopt-related warning on Cygwin.
948
949Omega 1.0.18 (2010-02-14):
950
951indexers:
952
953* Make the default charset "utf-8" not "UTF-8" as we lower case explicitly
954  specified character sets to compare to see if we need to reparse.  Previously
955  XML documents which explicitly specified their character set as UTF-8 would
  cause a needless restart of the parser.
957
958* omindex:
959
960  + Increase the wdf boost for the document title from 2 to 5, since 2 isn't
961    really enough.
962
963* scriptindex:
964
965  + Don't abort with "Unknown Exception" if indexing is disallowed or we hit
966    </body> for a document which had an overridden character set.  Fixes
967    ticket#410.
968
969Omega 1.0.17 (2009-11-18):
970
971indexers:
972
973* omindex:
974
975  + On Linux, change the memory limit on external filters to use _SC_PHYS_PAGES
976    since _SC_AVPHYS_PAGES excludes pages used by the OS cache and so will
977    often report a really low value.  Fixes Debian bug#548987 and ticket#358.
978
979  + Fix likely crash when reading output from external filter program if read()
980    is interrupted by a signal.
981
982  + Fix potential crash when indexing PostScript files (fixed by using delete[]
983    (not delete) for array allocated by new[]).
984
985testsuite:
986
987* utf8converttest: Charset "8859_1" isn't understood by Solaris libiconv, and
988  isn't a standard charset name, so just test it when using our built-in
989  converter and GNU libc.
990
991portability:
992
993* Fix build failure on Mac OS X 10.6.
994
995* Also check for socketpair() in -lxnet if it isn't found without, which
996  enables resource limits on external filter programs called by omindex on
997  Solaris, and possibly some other platforms.  Fixes ticket#412.
998
999Omega 1.0.16 (2009-09-10):
1000
1001* omega: Fix cross-site scripting vulnerability in reporting of exceptions
1002  (CVE-2009-2947).
1003
1004Omega 1.0.15 (2009-08-26):
1005
1006general:
1007
1008* omegascript.vim: The list of OmegaScript commands in the vim mode was rather
1009  out of date, and a few commands were misclassified.  Fix both problems and
1010  avoid future recurrences by automatically generating those lists from the
1011  command list in query.cc.
1012
1013documentation:
1014
1015* omegascript.html: Document that $date uses UTC.  (ticket#314)
1016
1017templates:
1018
1019* query: Link to "xapian.org" rather than "www.xapian.org".
1020
1021* inc/toptermsjs: Use double-quotes rather than single quotes for parameter
1022  values on the <script> tag.
1023
1024portability:
1025
1026* omindex: Implement correct handling of paths when calling external filter
1027  programs on Microsoft Windows.
1028
1029Omega 1.0.14 (2009-07-21):
1030
1031indexers:
1032
1033* omindex: Make sure that output is flushed after every message, not just after
1034  some of them.
1035
1036portability:
1037
1038* Avoid infinite loop in omindex and scriptindex when reading files under
1039  Cygwin with automatic end of line translation enabled.  This same bug can
1040  also manifest on Unix platforms if the file is truncated by another process
1041  while being read.
1042
1043Omega 1.0.13 (2009-05-23):
1044
1045indexers:
1046
1047* omindex:
1048
1049  + If the filter program needed for a file format isn't installed, report this
1050    explicitly when skipping subsequent files with the extension instead of
1051    misleadingly reporting "Unknown extension".
1052
1053  + Make -s actually work as a short-form for --stemmer (as documented by
1054    "omindex --help" and "man omindex").
1055
1056  + Drop the copyright info from the output of --version as it's perennially
1057    out of date and we don't report it for any other Xapian programs.
1058
1059* scriptindex:
1060
1061  + Add new "valuenumeric" action to add a document value using
1062    Xapian::sortable_serialise() to allow numeric sorting (ticket#260).
1063
1064build system:
1065
1066* configure: Enable more GCC warnings - "-Wstrict-null-sentinel" for 4.0+,
1067  "-Wlogical-op -Wmissing-declarations" for 4.3+.
1068
1069Omega 1.0.12 (2009-04-19):
1070
1071omega:
1072
1073* $log now retries a partial write, or one interrupted by a signal.
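
The retry behaviour described for $log can be sketched in Python as follows.  This is a minimal illustration, not Omega's code: the name write_all is hypothetical, and it assumes a blocking file descriptor.  Python 3.5 and later already retry most interrupted system calls, so the explicit InterruptedError handling is only a safety net.

```python
import os


def write_all(fd: int, data: bytes) -> None:
    """Write all of data to fd, retrying short or interrupted writes.

    Hypothetical helper for illustration only.
    """
    view = memoryview(data)
    while view:
        try:
            # os.write() may write fewer bytes than requested.
            written = os.write(fd, view)
        except InterruptedError:
            # Interrupted before anything was written: just try again.
            continue
        # Drop the bytes which were written and loop for the rest.
        view = view[written:]
```

For example, writing to one end of a pipe with write_all(w, b"hello") should leave exactly those five bytes readable from the other end.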
1074
1075build system:
1076
1077* configure: Fix iconv parameter type probe not to implicitly cast a string
1078  literal to char* - this is a warning under GCC currently, but the user could
1079  pass -Werror explicitly in CXXFLAGS, and this could be promoted to an error
1080  in future GCC versions, and may already be so for some other compilers.
1081
1082* Overriding CXXFLAGS at make-time (e.g. "make CXXFLAGS=-Os") no longer
1083  overrides any flags required for building with Xapian.
1084
1085* We now actually use the compiler warning flags which configure detects.
1086
1087Omega 1.0.11 (2009-03-15):
1088
1089documentation:
1090
1091* cgiparams.html: Note the technique of using a stub database file to allow a
1092  default of searching over multiple databases.
1093
1094indexers:
1095
1096* omindex:
1097
1098  + Add support for indexing Microsoft Office 2007 formats and XPS files
1099    (bug#290).
1100
1101  + Fix the extraction of metadata from OpenDocument formats.
1102
1103  + Fix "-l" which would previously always cause a segmentation fault if used
1104    ("--depth-limit" wasn't affected).
1105
1106build system:
1107
1108* configure: The output of g++ --version changed format (again) with GCC 4.3
1109  which meant configure got "g++" for the version.  Instead use the (hopefully)
1110  more robust technique of using g++ -E to pull out __GNUC__ and
1111  __GNUC_MINOR__.
1112
1113* configure: Turn on _FORTIFY_SOURCE where available (as we do in xapian-core).
1114
1115portability:
1116
1117* Fix to compile when RLIMIT_AS isn't available (as on NetBSD and OpenBSD).
1118  Instead use RLIMIT_VMEM or RLIMIT_DATA if either is available, else don't try
1119  to limit the memory the filter process can use.
1120
1121Omega 1.0.10 (2008-12-23):
1122
1123build system:
1124
1125* This release now uses newer versions of the autotools (autoconf 2.62 ->
1126  2.63; automake 1.10.1 -> 1.10.2).  The newer autoconf fixes a regression
1127  in autoconf 2.62 (and so Omega 1.0.7) with detecting the endian-ness of some
1128  platforms.
1129
1130Omega 1.0.9 (2008-10-31):
1131
1132documentation:
1133
1134* docs/overview.html: Document HTML parsing a bit, including robots
1135  meta and htdig_noindex.
1136
1137omega:
1138
1139* omega: Catch std::exception and report what its what() method returns.
1140
1141* omega: Remove undocumented and non-functional support for numeric sorting
1142  via CGI parameter SORT=#<slot> (SORT=<slot> works as before).
1143
1144build system:
1145
1146* configure: Sync warning flag handling changes from xapian-core to eliminate
1147  many warnings from GCC 4.3.
1148
1149Omega 1.0.8 (2008-09-04):
1150
1151documentation:
1152
1153* Fix a few typos and improve wording in a few places.
1154
1155indexers:
1156
1157* omindex:
1158
1159  + If the character encoding is specified using <meta http-equiv=...> in an
1160    HTML document then reparse the document if it isn't the encoding we're
1161    already using so that any preceding <title> is converted correctly
1162    (bug#292).
1163
1164  + Convert text from meta tag parameters to UTF-8 (bug#293).
1165
1166  + Handle <meta charset="..."> (new in HTML 5).
1167
1168  + Fix bug in HTML tag parameter parsing which was probably just a small
1169    performance penalty in real world cases, but could perhaps result in
1170    parsing bogus extra parameters in carefully contrived situations.
1171
1172portability:
1173
1174* Add missing <signal.h>, noted on FreeBSD by Henrik Brix Andersen.
1175
1176Omega 1.0.7 (2008-07-14):
1177
1178documentation:
1179
1180* omegascript.html,scriptindex.html: Fix empty titles.
1181
1182indexers:
1183
1184* omindex:
1185
1186  + When indexing text files, handle UCS-2 and UTF-16 text files with a
1187    byte-order mark (BOM), and ignore any UTF-8 "byte-order" mark.
1188
1189  + The built-in conversion code (used when iconv isn't available) now handles
1190    UCS-2/UTF-16 with and without a BOM, and also the explicit BE and LE forms.
1191
1192omega:
1193
1194* Overhaul the $highlight colour combinations since some were rather
1195  unreadable (Debian bug 484456).
1196
1197build system:
1198
1199* configure: Synchronise code for working out warning flags used for builds
1200  with that used for xapian-core, which in particular handles different
1201  output formats from "gcc --version".
1202
1203portability:
1204
1205* configure: Fix header checks to pre-include <sys/types.h> which Mac OS X
1206  needs for some other headers to work.
1207
1208* configure: Fix probing for iconv to work better when iconv isn't found
1209  (previously this only worked on Mac OS X with fink).
1210
1211* Fix compilation error on FreeBSD, introduced in 1.0.5.
1212
1213* In omega, cast size to unsigned before division to avoid a warning about
1214  signed overflow.
1215
1216packaging:
1217
1218* xapian-omega.spec: Remove "www." from xapian.org and oligarchy.co.uk URLs.
1219
1220Omega 1.0.6 (2008-03-17):
1221
1222documentation:
1223
1224* docs/omegascript.html: Improve formatting.
1225
1226indexers:
1227
1228* omindex:
1229
1230  + Add support for DjVu files.
1231
1232  + If we get an error trying to read a directory entry, report it to the user
1233    rather than ignoring it.
1234
1235omega:
1236
1237* New OmegaScript commands $addfilter, $lower, $upper.
1238
1239portability:
1240
1241* Check "defined HAVE_SYSMP" rather than just "HAVE_SYSMP".  This doesn't
1242  change behaviour, but fixes a compile warning on platforms other than Linux
1243  and IRIX.
1244
1245Omega 1.0.5 (2007-12-21):
1246
1247documentation:
1248
1249* Convert .txt docs to reStructuredText which we process to produce HTML.
1250
1251* Add a note inviting suggestions for additional reliable filter programs.
1252
1253* overview.html: omindex hasn't generated "W"-prefix terms since 0.9.7, so
1254  remove the documentation saying it does.
1255
1256indexers:
1257
1258* omindex:
1259
1260  + If a file's extension isn't found in the mime_map and contains uppercase
1261    ASCII characters, check for the lower cased extension (so .PDF and .Pdf
1262    behave the same way as .pdf, unless you deliberately add different mappings
1263    for them).
1264
1265  + '-f' is documented by --help as a short option for '--follow', but wasn't
1266    previously actually recognised.
1267
1268  + Limit filter programs to 7/8 of free physical memory on platforms where we
1269    know how to determine this statistic (currently at least Linux, FreeBSD,
1270    IRIX, HP-UX; probably Solaris and a few others too).  This helps to prevent
1271    runaway filters from causing a denial of service (bug#111).
1272
1273  + Avoid rereading uncompressed AbiWord documents in order to calculate their
1274    MD5 checksums.
1275
1276* scriptindex:
1277
1278  + Now inserts a ':' between prefix and term, using the same criteria which
1279    Xapian::QueryParser does.
1280
1281  + The 'BOOLEAN' action now ignores an empty input rather than adding just the
1282    prefix as a term.
1283
1284  + The 'UNIQUE' action now issues a warning for empty input but otherwise
1285    ignores it.
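
The omindex extension fallback listed above (try the extension as given, then its lower-cased form) can be sketched as follows.  The function name and the use of a plain dict for the mime_map are assumptions made for illustration; this is not omindex's implementation.

```python
from typing import Dict, Optional


def lookup_mime_type(ext: str, mime_map: Dict[str, str]) -> Optional[str]:
    """Return the MIME type for ext, or None if it isn't mapped.

    Hypothetical helper for illustration only.
    """
    # An exact match wins, so deliberately added mappings for e.g. "PDF"
    # take priority over the mapping for "pdf".
    mime = mime_map.get(ext)
    if mime is None and any("A" <= ch <= "Z" for ch in ext):
        # Lower-case only ASCII letters, leaving other characters alone.
        lowered = "".join(
            chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch for ch in ext
        )
        mime = mime_map.get(lowered)
    return mime
```

With a map containing only "pdf", looking up "PDF" or "Pdf" gives the same result as "pdf"; adding a separate "PDF" entry overrides that.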
1286
1287portability:
1288
1289* Add explicit includes of C headers needed to build with the latest snapshots
1290  of GCC 4.3.
1291
1292Omega 1.0.4 (2007-10-30):
1293
1294omega:
1295
1296* If an OmegaScript template specifies the same field name as both a boolean
1297  and a probabilistic term prefix then previously the boolean setting would
1298  be ignored (e.g. $setmap{prefix,foo,A}$setmap{boolprefix,foo,H}).  Now this
1299  generates an error.  If you set prefixes in your templates, you may wish to
1300  check them over before upgrading.
1301
1302Omega 1.0.3 (2007-09-28):
1303
1304general:
1305
1306* Distribution tarballs are now in the POSIX "ustar" format since it saves
1307  a few KB and we need to use it for xapian-core anyway.
1308
1309documentation:
1310
1311* Expand the output of 'mbox2omega --help' and refer the reader to it from
1312  docs/scriptindex.txt.
1313
1314indexers:
1315
1316* omindex:
1317
1318  + Add support for indexing AbiWord documents and TeX DVI files.
1319
1320  + Impose a 5 minute CPU time limit on filter programs to prevent problems if
1321    a filter program goes into an infinite loop on a malformed input.  Partly
1322    addresses bug#111.
1323
1324* scriptindex:
1325
1326  + Fix line number tracking in dump files.
1327
1328omega:
1329
1330* Add $muldiv{A,B,C} which calculates int(A*B/C).
1331
1332* Fix bug in decimal fraction in $size for files >= 1M in size.
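
The arithmetic behind $muldiv{A,B,C} can be sketched in Python as below.  This assumes integer arguments and truncation toward zero (which is what int() does to a real number); the function name is illustrative and the handling of C=0 is not taken from Omega.

```python
def muldiv(a: int, b: int, c: int) -> int:
    """Compute int(a*b/c) exactly, truncating toward zero.

    Illustrative sketch only.  Raises ZeroDivisionError if c is 0.
    """
    product = a * b
    # Divide the magnitudes, then restore the sign.  Python's // rounds
    # toward negative infinity, which is not what int() would do.
    quotient = abs(product) // abs(c)
    return quotient if (product < 0) == (c < 0) else -quotient
```

Multiplying before dividing keeps precision which computing A*(B/C) in integer arithmetic would lose: muldiv(7, 3, 2) is 10, whereas 7*(3//2) is 7.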
1333
1334templates:
1335
1336* query:
1337
1338  + Set HTML charset to utf-8 since that's what databases now are by default.
1339
1340  + Restyle to use CSS to draw a "score bar" instead of using images.
1341
1342  + Rework the layout of each hit.
1343
1344  + Add popup hints on mouse-over for various items.
1345
1346  + Tidy up some HTML gremlins.
1347
1348Omega 1.0.2 (2007-07-05):
1349
1350documentation:
1351
1352* scriptindex.txt: Fix typo.
1353
1354indexers:
1355
1356* omindex:
1357
1358  + If --url isn't passed, default to "/", but print a warning noting that this
1359    default has been used (at least for now).
1360
1361  + Report files that aren't indexed because their extensions aren't
1362    recognised.
1363
1364build system:
1365
1366* Value of XAPIAN_CONFIG supplied to configure is now passed to distcheck,
1367  to ensure that it works with uninstalled copies of Xapian.
1368
1369portability:
1370
1371* Fix test programs to build with a development snapshot of GCC 4.3.
1372
1373Omega 1.0.1 (2007-06-11):
1374
1375documentation:
1376
1377* overview.txt: As of 1.0.0, we no longer use pstotext for PostScript, but
1378  instead use ps2pdf followed by pdftotext (since this works for Unicode).
1379
1380* scriptindex.txt: Document that you can delete a document by supplying a new
1381  document which only contains the unique term.
1382
1383indexers:
1384
1385* Fix bug in HTML parser - if the text between two tags consisted entirely of
1386  whitespace it would just be ignored which could run words together if
1387  the tags didn't produce implicit whitespace.  This bug dates back to at least
1388  Omega 0.8.2.
1389
1390* omindex: Under Linux (and probably some other platforms) struct dirent can
1391  tell us the type of a directory entry for some filing systems, so make use of
1392  this to avoid calling stat() (or lstat()) unnecessarily - when indexing
1393  /usr/share/doc on my Linux box, this saves about 14000 explicit calls to
1394  stat() (leaving about 7000).
1395
1396omega:
1397
1398* Fix handling of query parsing errors (broken by changes in 1.0.0).
1399
1400packaging:
1401
1402* The required automake version has been lowered to 1.8.3, so RPMs can now be
1403  built on RHEL 4 and SLES 9.
1404
1405Omega 1.0.0 (2007-05-17):
1406
1407general:
1408
1409* Omega and the indexers now work in UTF-8.  If iconv() is available, omindex
1410  will use it to convert documents from other formats, otherwise it has
1411  built-in support for UTF-8 and ISO-8859-1; omindex knows how to run the
1412  various external filter programs to generate UTF-8 output; scriptindex
1413  assumes input is already in UTF-8.
1414
1415* Change the project name (used to name tarballs, and default installation
1416  paths) to "xapian-omega" since that's what the RPMs and Debian packages
1417  already use (there's a Rogue-like game called Omega).
1418
1419documentation:
1420
1421* docs/overview.txt: Document what each of the OmegaScript templates does.
1422
1423* docs/quickstart.txt: Assorted minor improvements.
1424
1425* docs/termprefixes.txt: Document new 'Z' prefix, and that the 'R' and 'W'
1426  prefixes are no longer used by Xapian.
1427
1428* docs/cgiparams.txt: FMT isn't limited to just `a-z' - the actual restriction
1429  is that it may not contain `..'.
1430
1431* docs/scriptindex.txt: Explicitly note that index=nopos is deprecated
1432  (scriptindex already emits a warning).
1433
1434* NEWS: Add note that Omega < 0.8.0 NEWS entries are in the xapian-core NEWS
1435  file.
1436
1437* TODO: Updated.
1438
1439indexers:
1440
1441* Updated to use the new Xapian::TermGenerator class.  This means that the
1442  indexing strategy has changed.
1443
1444* "--help" now reports the default stemming language (i.e. "english").
1445
1446* Implement new sample generating function which normalises all runs of
1447  whitespace to a single space, and fixes invalid UTF-8 in the sample.
1448
1449* omindex:
1450
1451  + We now index PostScript by converting to PDF with ps2pdf and then indexing
1452    that.  This allows us to index PostScript files containing Unicode
1453    characters outside of ISO-8859-1, and also means we now get metadata from
1454    PostScript files.  The downside is it is quite a bit slower.
1455
1456  + Add support for indexing MS Works documents using wps2text (part of
1457    libwps).
1458
1459  + Don't index empty files.
1460
1461* scriptindex:
1462
1463  + Fix optimisation of "load truncate=N" to actually work!
1464
1465  + The "truncate" action knows not to chop off a multibyte UTF-8 character.
1466
1467  + Update short option list for scriptindex to match documented usage (-h, -V
1468    and -s were not working).
1469
1470  + Remove -q and -u options - they no longer do anything and are only accepted
1471    for compatibility with really old versions (0.6.1 and earlier for -q; 0.7.5
1472    and earlier for -u).
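
The UTF-8-aware cut noted above for the "truncate" action can be sketched as follows.  This shows only the idea of never splitting a multibyte character; the function name is hypothetical and any other rules scriptindex applies when truncating are not modelled.

```python
def truncate_utf8(data: bytes, max_bytes: int) -> bytes:
    """Cut data to at most max_bytes without splitting a UTF-8 character.

    Illustrative sketch only; assumes data is valid UTF-8.
    """
    if len(data) <= max_bytes:
        return data
    end = max(max_bytes, 0)
    # Continuation bytes look like 10xxxxxx.  If the byte just after the
    # cut is one, the cut is inside a character, so move it back.
    while end > 0 and (data[end] & 0xC0) == 0x80:
        end -= 1
    return data[:end]
```

For the 6-byte encoding of "héllo", a limit of 2 gives b"h" (the 2-byte "é" is dropped whole), while a limit of 3 keeps "hé".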
1473
1474omega:
1475
1476* Add an alternative implementation of date range filtering which uses a
1477  MatchDecider.  This allows everything that the existing implementation does,
1478  plus you can support sorting on a choice of dates (e.g. first published or
1479  last updated), and filtering works to a resolution of a minute rather than a
1480  day.  Set CGI parameter DATEVALUE to enable this, and to specify the value to
1481  use.  Since omindex now adds the last modified date as value 0, this will
1482  work with omindex.
1483
1484* Enhance $substr{} to accept a negative length (meaning to count back from the
1485  end of the string).
1486
1487* New CGI parameters to allow finer control of sorting and ranking - SORTAFTER
1488  and DOCIDORDER.
1489
1490* The sorting options are now encoded in $filters so Omega can automatically
1491  reset to page 1 if they are changed.
1492
1493* Add new OmegaScript $weight command which returns the raw document weight -
1494  mostly useful for debugging purposes.
1495
1496* $topterms{} now generates unstemmed terms.
1497
1498* $prettyterm{TERM} has been updated to fit with changes to the term generation
1499  strategy.
1500
1501* Add 'you' and 'your' as stopwords.
1502
1503* $filesize{SIZE} enhanced to return a decimal point for K, M, and G (e.g.
1504  "2.1K" and "4.0M" rather than "2K" and "4M"); $filesize{0} is now "0 bytes";
1505  $filesize{1} is now "1 byte"; $filesize{SIZE} where SIZE is negative is now
1506  "".
1507
1508* Remove $freqs as it has been deprecated for ages.
1509
1510* Remove support for xB, xDATE1, xDATE2, xDAYSMINUS, and xDEFAULTOP which were
1511  deprecated in favour of xFILTER in 0.7.5 (over 3 years ago).
1512
1513* Remove deprecated aliases for CGI parameters (deprecated in 0.6.3 or 0.6.5,
1514  more than 3.5 years ago): RAW_SEARCH (now RAWSEARCH), DATE1 (now START),
1515  DATE2 (now END), DAYSMINUS (now SPAN but with slightly different semantics),
1516  and MIN_HITS (now MINHITS).
1517
1518* Remove "bias_weight" and "bias_halflife" CGI parameters since they rely on
1519  Enquire::set_bias() which has been removed.
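
The $filesize{SIZE} output format described above can be sketched in Python as below.  The quoted strings ("0 bytes", "1 byte", "" for negative sizes, and one decimal place for K, M and G) come from the entry; the 1024-based units, the truncation (rather than rounding) of the decimal, and the function name are assumptions.

```python
def filesize(size: int) -> str:
    """Format a size in bytes in the style described for $filesize.

    Illustrative sketch only, not Omega's implementation.
    """
    if size < 0:
        return ""
    if size == 1:
        return "1 byte"
    if size < 1024:
        return "%d bytes" % size
    units = (("K", 1024), ("M", 1024 ** 2), ("G", 1024 ** 3))
    for unit, scale in units:
        # Use this unit if the value is below 1024 of it, or if it is the
        # largest unit we know about.
        if size < scale * 1024 or unit == "G":
            # Work in tenths using integer arithmetic (assumed truncation).
            tenths = size * 10 // scale
            return "%d.%d%s" % (tenths // 10, tenths % 10, unit)
    return ""  # Not reached; keeps the return type obvious.
```

Under these assumptions a size of 2151 bytes is reported as "2.1K" and exactly 4 MiB as "4.0M".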
1520
1521templates:
1522
1523* The 'query' template no longer uses $topterms by default.
1524
1525* New 'topterms' template provides a query template with $topterms support.
1526
1527* Template fragments which aren't intended for direct use have been moved to
1528  an "inc" subdirectory.
1529
1530testsuite:
1531
1532* md5test: Add tests for MD5 code.
1533
1534build system:
1535
1536* `./configure --enable-quiet' already allows you to specify at configure time
1537  to pass `--quiet' to libtool.  Now you can override this at make-time by
1538  using `make QUIET=' (to turn off `--quiet') or `make QUIET=y' (to turn on
1539  `--quiet').
1540
1541* configure: Disable probes for f77, gcj, and rc completely by preventing
1542  the probe code from even appearing in configure - this reduces the size of
1543  configure by 29% and should speed it up significantly.
1544
1545portability:
1546
1547* Fixed to build with GCC 4.3 snapshot.
1548
1549* We now make use of the safe*.h portability headers from xapian-core.
1550
1551* Ensure that the result of snprintf is zero terminated since MSVC's snprintf
1552  is broken (by design it seems).
1553
1554* configure: xapian-config --cxxflags now includes -ptused for SGI's C++
1555  compiler, so we don't need to probe for it here.
1556
1557* configure: Perform a link test for posix_fadvise to fix misdetection on
1558  HP-UX.
1559
1560Omega 0.9.10 (2007-03-04):
1561
1562documentation:
1563
1564* docs/omegascript.txt: Rewrite introductory paragraph.  Note that
1565  whitespace is significant, and add explicit warning to $setmap.
1566
1567* docs/termprefixes.txt: Expand section on boolean prefixes, showing
1568  how to generate them using scriptindex, and how to allow them to be
1569  selected in an HTML form.
1570
1571indexers:
1572
1573* omindex: Generate correct MD5 checksums on big-endian platforms.
1574
1575omega:
1576
1577* Fix $substr{} with negative start to actually work.
1578
1579* Fix $substr{} to never cause a C++ exception.
1580
1581packaging:
1582
1583* omega.spec.in: Remove "." from the end of the Summary.
1584
1585Omega 0.9.9 (2006-11-09):
1586
1587documentation:
1588
1589* Ship our custom INSTALL file rather than the generic one from autoconf which
1590  we've accidentally been shipping instead since 0.9.5.
1591
1592indexers:
1593
1594* scriptindex: The "date" action no longer modifies the value it operates on
1595  (it was never meant to!)
1596
1597omega:
1598
1599* Report an error if $setmap is called with an even number of parameters.
1600  An incorrect example in the documentation used to suggest this, so it's
1601  particularly useful to catch this case.
1602
1603packaging:
1604
1605* RPMs: Prevent binaries getting an rpath for /usr/lib64 on FC6.
1606
1607Omega 0.9.8 (2006-11-02):
1608
1609omega:
1610
1611* $substr where the start is negative and longer than the string (e.g.
1612  $substr{abcd,-5,1}) wasn't working as intended.
1613
1614build system:
1615
1616* configure: Tell AC_CHECK_HEADERS to suppress its backward compatibility mode,
1617  so it only checks headers with the compiler.  This speeds up configure a
1618  little, and is what we do elsewhere.
1619
1620* configure: Warning flags for GCC weren't actually getting used.  Fix this to
1621  work and use the same warning flags for GCC and Intel C++ as xapian-core does.
1622  Fix all the warnings this uncovered!
1623
1624* omega,omindex,scriptindex: Remove some old unused code.
1625
1626portability:
1627
1628* Ensure that we always pass an unsigned char value to isupper(), toupper(),
1629  etc as they are undefined on other values (glibc makes them work for signed
1630  char values too, but this is an extension).
1631
1632* configure: Pass magic options to SGI's C++ compiler to allow linking of
1633  templates to work.
1634
1635* configure: IRIX doesn't allow stdint.h to be included from C++ so we need
1636  a smarter configure test than AC_CHECK_HEADERS.
1637
1638* Fix warnings from SGI's C++ compiler.
1639
1640Omega 0.9.7 (2006-10-10):
1641
1642documentation:
1643
1644* omegascript.txt: Note that (by design) an omegascript template can't
1645  contain an infinite loop.
1646
1647* termprefixes.txt: "$setmap{title,S}" should be "$setmap{prefix,title,S}".
1648
1649* Use the default paths to the database directories and the omega CGI binary in
1650  examples.
1651
1652* README: Update reference to "CVS" to say "SVN".
1653
1654indexers:
1655
1656* Don't get confused by "a<b" in Javascript in a <script> tag.  Fixes bug#91.
1657
1658* Support htdig's "ignore this bit" comments.
1659
1660* Don't generate terms with more than 3 trailing symbols ('-', '+', or '#').
1661
1662* omindex:
1663
1664  + Add the file last modified time as value #0.
1665
1666  + Generate an MD5 checksum of each file indexed and store it in value #1
1667    to allow duplicates to be collapsed.
1668
1669  + Store the file's last modified time in the document data as "modtime" so it
1670    shows up in search results (and tweak the query template so the display of
1671    this information looks nicer).  Don't add "modtime" field if the timestamp
1672    is (time_t)-1.
1673
1674  + Run pdfinfo once and pull out the fields we want using string operations,
1675    instead of running it twice filtered through sed.
1676
1677  + Parse the XML from OpenDocument and OpenOffice using new subclasses of
1678    HtmlParser.  Only extract meta.xml once.
1679
1680  + Add "size" field to document data.
1681
1682  + Run xls2csv on MS Excel files, run catppt on MS Powerpoint files, and also
1683    index MS Word templates (.dot) the same way as .doc files.
1684
1685  + Don't generate 'W' terms since omega doesn't use them.
1686
1687  + If a filter program isn't installed, then don't try it again for the same
1688    extension (not perfect but an improvement - previously we indexed an empty
1689    document!)
1690
1691  + If popen() fails, treat it as a read error.
1692
1693* scriptindex:
1694
1695  + Add new "load" action to allow the contents of an external file to be
1696    loaded and parsed.
1697
1698  + Fix check for whether a record has content in the case where the same field
1699    is processed more than once.
1700
1701omega:
1702
1703* Add $pack and $unpack OmegaScript commands to allow big endian binary values
1704  to be encoded and decoded (for use with omindex's lastmod in value #0).
1705
1706* omega.conf: Fix code which reads omega.conf to be line based as documented
1707  rather than the wacky whitespace based scheme that was actually implemented.
1708  Also we now allow "#" comments and blank lines in omega.conf.
1709
1710* Fix $highlight{} to work with capitalised words (it used to work but
1711  regressed in 0.8.2).
1712
1713* Use '\t' to separate terms in xP since filter terms might contain '.'.  Fixes
1714  bug#87.
1715
1716testsuite:
1717
1718* Add htmlparsetest which tests the MyHtmlParser class.
1719
1720build system:
1721
1722* Makefile.am: Make use of the dist_ prefix to avoid having to list files in
1723  EXTRA_DIST as well as in *_SCRIPTS, *_DATA, and man_MANS.
1724
1725* Makefile.am: Prefer $(sysconfdir) to @sysconfdir@ since the former can be
1726  overridden on the "make" command line.
1727
1728portability:
1729
1730* xapian-config will now switch Sun's C++ compiler into ANSI C++ compliant
1731  mode, so remove all the special case bits of code added for just this one
1732  compiler.
1733
1734* omindex: Fix escaping of filenames to cast characters to "unsigned char" so
1735  that isalnum() works correctly everywhere.  Not a security hole as dangerous
1736  characters were still being escaped.
1737
1738* Call pclose() not fclose() on a FILE* obtained from popen().  This bug could
1739  cause us to run out of file descriptors on some platforms.
1740
1741* configure: Check for strftime.
1742
1743packaging:
1744
1745* omega.spec.in: Include documentation in the RPM package.
1746
1747Omega 0.9.6 (2006-05-15):
1748
1749documentation:
1750
1751* docs/omegascript.txt: Clarified description of $now.
1752
1753indexers:
1754
1755* scriptindex: Fix "index" and "indexnopos" without a prefix to set the weight
1756  correctly (bug introduced in 0.9.5).
1757
1758omega:
1759
1760* Added new OmegaScript commands $filterterms and $substr.
1761
1762portability:
1763
1764* configure: Update snprintf detection to match xapian-core.
1765
1766* Fix MSVC warnings.
1767
1768packaging:
1769
1770* omega.spec.in: Create and package /var/lib/omega/cdb and /var/log/omega.
1771
1772Omega 0.9.5 (2006-04-08):
1773
1774documentation:
1775
1776* README: Add pointer to documentation.
1777
1778* Added man pages for omindex and scriptindex, generated using help2man.
1779
1780indexers:
1781
1782* scriptindex:
1783
1784  + If we fail to open the index script, die with an error (previously we
1785    acted as if an empty file was specified).
1786
1787  + Warn about a useless "weight" action, even if it's followed by another
1788    non-useless action (e.g. "field") - previously we only warned if it
1789    was last or followed only by other useless actions.
1790
1791  + Warn if "unique=<prefix>" is used without a corresponding
1792    "boolean=<prefix>" on the same line.
1793
1794  + Warn that "index=nopos" is deprecated and should be replaced by
1795    "indexnopos".
1796
1797  + Add explanatory text "(note that actions are executed from left to right)"
1798    when reporting useless actions.
1799
1800  + Added new "hash" command to allow hashed terms to be generated from long
1801    URLs like omindex does.
1802
1803* htdig2omega.script,mbox2omega.script: Make use of the new scriptindex "hash"
1804  command.
1805
1806* dbi2omega: Check DBIDRIVER environment variable to allow a driver other
1807  than mysql to be specified without modifying the script.
1808
1809omega:
1810
1811* Fix $opt[fieldnames] handling.  Previously it would try to kick in if you
1812  didn't set fieldnames but set any alphabetically later option!  The symptom
1813  was that $field{} would stop working (bug#72).
1814
1815portability:
1816
1817* omindex,omega: Tweaks for MSVC compilation.
1818
1819Omega 0.9.4 (2006-02-21):
1820
1821documentation:
1822
1823* COPYING: Updated FSF address.
1824
1825Omega 0.9.3 (2006-02-16):
1826
1827documentation:
1828
1829* overview.txt: The U prefix (URL term) was grouped with the date searching
1830  prefixes, but it makes more sense to group it with the prefixes relating to
1831  parts of the URL (H for hostname, P for path, etc).
1832
1833* overview.txt: Add pointer to documentation of the supported query syntax.
1834
1835* omegascript.txt: Improve descriptions of $cgi, $collapsed, $value, $version.
1836
1837* termprefixes.txt: Fix typo.
1838
1839indexers:
1840
1841* omindex: Add --preserve-nonduplicates / -p option to not delete any documents
1842  that aren't updated, in replace duplicates mode (so that multiple runs of
1843  omindex on different subsites don't stomp on each other).
1844
1845* omindex,scriptindex: Add "--stemmer" option to omindex and scriptindex
1846  to allow the stemming language to be set.  Fixes bug#11.
1847
1848* omindex,scriptindex: More consistent --help and --version output.
1849
1850* omindex: Add support for OpenDocument format mimetypes and extensions out of
1851  the box.  Previously you could index them but had to pass a "-m" option for
1852  each OpenDocument filename extension you wanted to handle.
1853
1854* scriptindex: The "-q" option no longer actually controls anything.  Just
1855  ignore it for backwards compatibility (and don't document it in --help).
1856
1857omega:
1858
1859* If executing an OmegaScript command causes a Xapian exception to be thrown,
1860  catch it and copy the error message into error_msg (which is read by the
1861  $error command).  This allows such errors to be reported in a nicer way.
1862
1863* Added "SORTREVERSE" CGI parameter which allows the sort order to be reversed
1864  when sorting on a value.  Removed "SORTBANDS" CGI parameter since it no
1865  longer does anything.
1866
1867* Added $find{LIST,STRING} to return the subscript of the first occurrence of
1868  string STRING in list LIST.
1869
1870* Added $lookup{CDBFILE,KEY} OmegaScript command to perform a lookup in a CDB
1871  file.
1872
1873* Added new feature which allows you to avoid storing fieldnames in every
1874  document.  Instead you just store the field values, one per line, and add
1875  something like "$set{fieldnames,$split{caption sample url}}" to the
1876  OmegaScript template to specify the fieldnames to use.  This can save a lot
1877  of disk space for a large database.
1878
1879* Add new "$split{}" OmegaScript command which splits a string to give an
1880  OmegaScript list.
1881
1882* Fix $url{} to escape "+" to "%2b".  Also fix encoding of top-bit-set
1883  characters on platforms where char is signed by default.
1884
1885* Speed up $highlight{} - only compare terms which are the same length.
1886
1887* Reduce memory usage if a lot of documents are marked as relevant.
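
As an illustration of the $url{} fix above, here is a minimal sketch of
percent-encoding that escapes "+" and copes with top-bit-set bytes.  The
function name url_encode is hypothetical and this is not Omega's actual code;
the key point is going via unsigned char so that bytes >= 0x80 are not treated
as negative values on platforms where char is signed.

```cpp
#include <string>

// Hypothetical helper (not taken from Omega): percent-encode a string for
// use as a URL query parameter value.
std::string url_encode(const std::string& in) {
    static const char hex[] = "0123456789abcdef";
    std::string out;
    for (char c : in) {
        // Go via unsigned char so bytes >= 0x80 never become negative
        // where plain char is signed.
        unsigned char ch = static_cast<unsigned char>(c);
        bool unreserved = (ch >= 'a' && ch <= 'z') ||
                          (ch >= 'A' && ch <= 'Z') ||
                          (ch >= '0' && ch <= '9') ||
                          ch == '-' || ch == '_' || ch == '.' || ch == '~';
        if (unreserved) {
            out += c;
        } else {
            // Everything else, including '+' and space, is escaped.
            out += '%';
            out += hex[ch >> 4];
            out += hex[ch & 0x0f];
        }
    }
    return out;
}
```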
1888
1889templates:
1890
1891* query: Make the page title shorter so there's more chance it will fit on icon
1892  bars, etc.
1893
1894* opensearch: Add missing escaping.
1895
1896* godmode: If a non-existent docid is specified, report the error and prompt
1897  the user to enter another docid.  Fixes bug#60.
1898
1899portability:
1900
1901* omega: Fix printf type mismatch on 64 bit platforms.
1902
1903* omega: Cast time_t to unsigned long to avoid problems on 64 bit platforms.
1904
1905* Use snprintf where available.
1906
1907* Write top-bit-set characters using \xXX notation to avoid warnings from
1908  Intel's C++ compiler.
1909
1910Omega 0.9.2 (2005-07-15):
1911
1912* omega: Changed $highlight so if OPEN and CLOSE aren't specified, they default
1913  to highlighting each word from the query with a different background colour
1914  like gmane does (previous default was to use '<strong>' and '</strong>').
1915
1916* omega: Call QueryParser::set_database() as this is now used to decide what to
1917  do for terms like "C#".
1918
1919* omega: Added the ability to set boolean prefixes for the QueryParser by
1920  setting a "boolprefix" map in the OmegaScript template.
1921
1922* omega: Added $length{} and $stoplist{} commands to OmegaScript.
1923
1924* scriptindex: Fix infinite loop if there's no newline at the end of a dumpfile.
1925
1926* docs/termprefixes.txt: Explain how to use termprefixes with scriptindex and
1927  omega, since that's what most people will want to know.
1928
1929* docs/omegascript.txt: Use standard "S" prefix for title in example for
1930  $setmap, rather than "XT".
1931
1932Omega 0.9.1 (2005-06-06):
1933
1934* Releases are now created using libtool 1.5.18 and automake 1.9.5.
1935
1936* Updated RPM packaging.
1937
1938Omega 0.9.0 (2005-05-13):
1939
1940* Updated for 0.9.0 API changes.
1941
1942* omindex/scriptindex: Generate terms like "c#".
1943
1944* Added mbox2omega script which allows a mail folder to be indexed using
1945  scriptindex.  Mostly it's an example as there's no mechanism included to show
1946  the full original message.
1947
1948omega:
1949
1950* The configuration file is now looked for differently - you can now set
1951  the environment variable OMEGA_CONFIG_FILE.  See docs/overview.txt for
1952  details.
1953
1954* $highlight can now highlight terms like "C#".
1955
1956* Add new template 'opensearch' to implement basic opensearch feeds of search
1957  results.
1958
1959omindex:
1960
1961* URL hashing previously depended on sizeof(long) so databases weren't totally
1962  portable between platforms.  This is now fixed, but to do so we've had to
1963  break compatibility with databases built on platforms with 64 bit longs
1964  with URLs > 228 bytes.
1965
1966* Removed useless "DUPE_duplicate" option.
1967
1968* Added support for indexing Perl "pod" documentation using pod2text.
1969
1970* Replaced -l/--no-recurse with -l/--depth-limit which takes an argument
1971  allowing recursion to be restricted to any depth, not just 0 or infinity!
1972
1973* Extend -M/--mime-type to allow an existing mapping to be removed by omitting
1974  the type.
1975
1976* Fixed code so that we get lstat() prototype on Linux systems where we have
1977  posix_fadvise().
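
To illustrate the URL hashing portability issue above: a hash computed in a
fixed-width type gives the same result regardless of sizeof(long).  The sketch
below uses 32-bit FNV-1a purely as an example; it is an assumption for
illustration, not the hash function omindex uses, and portable_hash is a
hypothetical name.

```cpp
#include <cstdint>
#include <string>

// Hypothetical illustration (FNV-1a, NOT the hash omindex uses): because the
// arithmetic is done in std::uint32_t rather than long, the result is the
// same on platforms with 32 bit and 64 bit longs.
std::uint32_t portable_hash(const std::string& s) {
    std::uint32_t h = 2166136261u;  // FNV-1a 32-bit offset basis
    for (unsigned char ch : s) {
        h ^= ch;
        h *= 16777619u;  // FNV prime; wraps modulo 2^32 on every platform
    }
    return h;
}
```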
1978
1979scriptindex:
1980
1981* Improved handling of extra blank lines in dump file.
1982
1983* Strip multiple \r characters from end of line.
1984
1985* Complain if a dump file doesn't appear to have been = escaped correctly.
1986
1987* Flush database after each input file to ensure all changes from a file
1988  make it in.
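
A minimal sketch of the "strip multiple \r characters" behaviour described
above, assuming each line has already been read into a std::string with the
newline removed.  The helper name strip_trailing_cr is hypothetical and the
code is illustrative rather than scriptindex's actual implementation.

```cpp
#include <string>

// Hypothetical helper: remove every trailing '\r' from a line, so that
// "text\r\r" becomes "text".
void strip_trailing_cr(std::string& line) {
    std::string::size_type end = line.find_last_not_of('\r');
    // find_last_not_of() returns npos if the line is empty or consists only
    // of '\r'; npos + 1 wraps to 0, so the line is emptied in that case.
    line.erase(end + 1);
}
```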
1989
1990documentation:
1991
1992* docs/omegascript.txt: Clarify $field description slightly.
1993
1994* docs/cgiparams.txt,docs/omegascript.txt: Fixed 3 references to OmXxxx classes.
1995
1996* docs/termprefixes.txt: Added a single document covering all aspects of term
1997  prefixes.
1998
1999* docs/omegascript.txt: Moved $collapsed into correct place alphabetically!
2000
2001* docs/cgiparams.txt,docs/overview.txt: Improved description of how B filters
2002  are handled when building the query.
2003
2004* docs/scriptindex.txt: Note that actions are applied in the specified order.
2005
2006Omega 0.8.5 (2004-12-23):
2007
2008* README,INSTALL: Proper installation instructions.
2009
2010* omega: If an exception is thrown, make sure that the HTTP headers
2011  get written so that we don't cause "500 Internal Server Error".
2012  This problem was introduced by the change to allow a user specified
2013  Content-Type in 0.8.0.  Partly addresses bug#60.
2014
2015* scriptindex: Fixed "Unknown Exception" when trying to "unhtml" text which
2016  contains "</body>" (bug#61).  This bug was introduced in 0.8.4.
2017
2018* omindex/scriptindex: <h1> - <h6> and </h1> - </h6> now leave a space in the
2019  dumped HTML.  This bug was introduced in 0.8.4 - before that any tag left
2020  a space in the dumped HTML.
2021
2022* omindex: Only try to delete removed documents in "replace duplicates" mode
2023  (which is the default).
2024
2025* omindex: Change behaviour of crawler such that it doesn't follow symbolic
2026  links any more.  The new "--follow" command line option turns following of
2027  symlinks back on.
2028
2029* dbi2omega: Add a comment to the start of the file detailing what
2030  dbi2omega does.
2031
2032Omega 0.8.4 (2004-12-08):
2033
2034* omindex,scriptindex: Improved HTML to text conversion - now we strip
2035  leading and trailing whitespace and convert all other consecutive groups of
2036  whitespace to a single space.  Also the parser now knows that some tags
2037  should be regarded as word breaks and some shouldn't (previously all tags
2038  were treated as word breaks).
2039
2040* omindex: Removed bogus extra line from code which was meant to
2041  truncate samples, titles, etc at a word boundary, but has never actually
2042  worked!
2043
2044* omindex: Added hooks for indexing the following formats: OpenOffice (requires
2045  unzip), MS Word (requires antiword), WordPerfect (requires wpd2text), RTF
2046  (requires unrtf).
2047
2048* omindex: If a filename to be passed to a filter program has a leading "-",
2049  protect it from possible interpretation as an option by prepending "./".
2050
2051* omega: When there's only a boolean query we promote it to be the query.
2052  Tweaked so we use boolean weights in this case.
2053
2054* omega: Use Query::empty() instead of the now deprecated Query::is_empty().
2055
2056* omega,omindex,scriptindex: Use the new Database/WritableDatabase
2057  constructors.
2058
2059* templates/godmode: Finished off godmode template.
2060
2061* Compile everything as C++.
2062
2063* Check snprintf actually works - some older versions don't implement C99
2064  snprintf semantics.
2065
2066* XAPIAN_FLAGS already links with xapianqueryparser so remove
2067  -lxapianqueryparser from omega_LDADD as it was causing link errors on cygwin.
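
Two of the omindex changes above can be sketched in a few lines.  The helper
names protect_filename and collapse_whitespace are hypothetical, and the code
is an illustration under simplifying assumptions (it works on plain text and
knows nothing about HTML tags), not the code omindex actually contains.

```cpp
#include <string>

// Hypothetical helper: stop a filename with a leading "-" from being
// interpreted as an option by a filter program, by prepending "./".
std::string protect_filename(const std::string& name) {
    if (!name.empty() && name[0] == '-') return "./" + name;
    return name;
}

// Hypothetical helper: strip leading and trailing whitespace and convert
// every other run of whitespace to a single space.
std::string collapse_whitespace(const std::string& text) {
    std::string out;
    bool pending_space = false;
    for (char c : text) {
        if (c == ' ' || c == '\t' || c == '\n' ||
            c == '\r' || c == '\f' || c == '\v') {
            // Only remember a space once some text has been emitted, which
            // drops leading whitespace.
            pending_space = !out.empty();
        } else {
            if (pending_space) out += ' ';
            pending_space = false;
            out += c;
        }
    }
    // A space still pending here is never emitted, which drops trailing
    // whitespace.
    return out;
}
```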
2068
2069Omega 0.8.3 (2004-09-20):
2070
2071* scriptindex: --version now actually reports the version.  --help now exits
2072  with status 0 rather than status 1.
2073
2074* RPM packaging: Updated.  The most notable change is that the RPM is now
2075  called xapian-omega because there's already an omega RPM (in Fedora Core at
2076  least) which is a game.  Also htdig2omega and htdig2omega.script are now
2077  included in the RPM.
2078
2079* Install htdig2omega.script in ${prefix}/share/omega/ rather than
2080  ${prefix}/share/.
2081
2082Omega 0.8.2 (2004-09-13):
2083
2084* omega: $highlight now handles accented characters (bug#9).
2085
2086* omega: Use new checkatleast parameter to Enquire::get_mset to implement
2087  MINHITS.
2088
2089* omindex: When running with "replace duplicates" mode (the default), detect
2090  documents removed since the last indexing run and delete them from the
2091  database (bug#34).
2092
2093* omindex: Use the new WritableDatabase::replace_document(term, doc) method.
2094
2095* scriptindex: Report index script file name and line number when
2096  reporting errors in it.  Added warning for redundant actions,
2097  such as "truncate" as the last action in a rule.
2098
2099* templates/query: Always report if the database is not found - previously we
2100  only did so if there was a query.
2101
2102* templates/query: Fixed </center> tag being omitted in certain cases.
2103
2104* docs/omegascript.txt: Added note that $add{$hit,1} gives
2105  the "hit number".
2106
2107* Now includes htdig2omega and htdig2omega.script which allow you to crawl
2108  remote websites with ht://dig, then build a searchable index of them with
2109  Xapian and Omega.
2110
2111* Link with -lxapianqueryparser, not -lomqueryparser.
2112
2113Omega 0.8.1 (2004-06-30):
2114
2115* omindex: Renamed hash() to hash_string() to avoid colliding with something
2116  on IRIX.
2117
2118* omega: Changed MORELIKE to pick up to 40 terms, rather than up to 6 (feedback
2119  on the mailing list suggests this gives much better results).
2120
2121* scriptindex: Added explicit catch for std::bad_alloc.
2122
2123Omega 0.8.0 (2004-04-19):
2124
2125* scriptindex: Change default to *not* overwriting the database (use
2126  --overwrite if you really want to do this); -u is now accepted but ignored.
2127
2128* scriptindex: Use getopt for option parsing.
2129
2130* omindex: Added --overwrite option which forces an existing database to be
2131  deleted before indexing begins.
2132
2133* templates/xml: Correct spelling of `relavence' to `relevance'.  NB: if you're
2134  parsing the XML output, you'll need to fix this spelling in your parser!
2135
2136* templates/xml: Now set HTTP header: "Content-Type: application/xml".
2137
2138* templates/xml: Remove unused OmegaScript code:
2139  `$set{topterms,$or{$ne{$msize,0},$query}}'.
2140
2141* indextext.cc,omindex.cc,scriptindex.cc: Updated to use add_term() instead of
2142  add_term_nopos().
2143
2144* omega: Added $httpheader OmegaScript command to allow arbitrary HTTP headers
2145  and alternative Content-Type headers to be specified.
2146
2147* omega: If the probabilistic query was bad, don't try to run the match.
2148
2149* omega: Don't crash if there's a date filter but no probabilistic query.
2150
2151* omindex/scriptindex: Raw terms with a multicharacter prefix are now indexed
2152  with a : inserted (e.g. as XFOO:Rterm).  This matches what the query parser
2153  does.
2154
2155* omindex/scriptindex: Don't create R terms for terms which start with a digit.
2156
2157* omindex: Use O_STREAMING and/or posix_fadvise() when reading files to be
2158  indexed (if available).  This helps to keep the Xapian database in cache,
2159  and should greatly improve indexing throughput.
2160
2161* docs/scriptindex.txt: Make more explicit that boolean produces a *single*
2162  boolean term.
2163
2164* docs/cgiparams.txt: Note that START and END should be in the format YYYYMMDD.
2165
2166For NEWS entries for Omega versions prior to 0.8.0, see the xapian-core NEWS
2167file.
2168