.. cyrusman:: squatter(8)

.. author: Nic Bernstein (Onlight)

.. _imap-reference-manpages-systemcommands-squatter:

============
**squatter**
============

Create SQUAT and Xapian indexes for mailboxes

Synopsis
========

.. parsed-literal::

    general:
    **squatter** [ **-C** *config-file* ] [**mode**] [**options**] [**source**]

    i.e.:
    **squatter** [ **-C** *config-file* ] [ **-v** ] [ **-a** ] [ **-S** *seconds* ] [ **-Z** ]
    **squatter** [ **-C** *config-file* ] [ **-v** ] [ **-a** ] [ **-i** ] [ **-N** *name* ] [ **-S** *seconds* ] [ **-r** ] [ **-Z** ] *mailbox*...
    **squatter** [ **-C** *config-file* ] [ **-v** ] [ **-a** ] [ **-i** ] [ **-N** *name* ] [ **-S** *seconds* ] [ **-r** ] [ **-Z** ] **-u** *user*...
    **squatter** [ **-C** *config-file* ] [ **-v** ] [ **-a** ] **-R** [ **-n** *channel* ] [ **-d** ] [ **-S** *seconds* ] [ **-Z** ]
    **squatter** [ **-C** *config-file* ] [ **-v** ] [ **-a** ] **-f** *synclogfile* [ **-S** *seconds* ] [ **-Z** ]
    **squatter** [ **-C** *config-file* ] [ **-v** ] **-t** *srctier(s)*... **-z** *desttier* [ **-B** ] [ **-F** ] [ **-U** ] [ **-T** *reindextiers* ] [ **-X** ] [ **-o** ] [ **-S** *seconds* ] [ **-u** *user*... ]



Description
===========

.. Note::
    The name "**squatter**" once referred both to the SQUAT indexing
    engine and to the command used to create indexes. Now that Cyrus
    supports more than one index type -- SQUAT and Xapian, as of this
    writing -- the name "**squatter**" refers to the command used to
    control index creation. The terms "SQUAT" and "SQUAT index(es)"
    refer to the indexes used by the older SQUAT indexing engine.
    As of v3, the *search_engine* setting in *imapd.conf* determines
    which search engine is used.

**squatter** creates a new text index for one or more IMAP mailboxes.
The index is a unified index of all of the header and body text
of each message in a given mailbox.
This index is used to significantly
reduce IMAP SEARCH times on a mailbox.

**mode** is one of indexer, search, rolling, synclog, compact or audit.

By default, **squatter** creates an index of ALL messages in the
mailbox, not just those added since the last time it was run. The
**-i** option selects incremental updates. Messages appended to the
mailbox after **squatter** has run are NOT included in the index.
To include new messages in the index, run **squatter** again,
schedule it regularly via crontab or an entry in the EVENTS section
of :cyrusman:`cyrus.conf(5)`, or use *rolling* mode (**-R**).

In the first synopsis, **squatter** indexes all mailboxes.

In the second synopsis, **squatter** indexes the specified mailbox(es).
The mailboxes are space-separated.

In the third synopsis, **squatter** indexes the mailbox(es) of the
specified user(s).

For the latter two index modes (mailbox, user) one
may optionally specify **-r** to recurse from the specified start, or
**-a** to limit action only to mailboxes which have the shared
*/vendor/cmu/cyrus-imapd/squat* annotation set to "true".

In the fourth synopsis, **squatter** runs in rolling mode. In this
mode **squatter** backgrounds itself and runs as a daemon (unless
**-d** is set), listening to a sync log channel chosen using the **-n**
option, and set up using the *sync_log_channels* setting in
:cyrusman:`imapd.conf(5)`. Very soon after messages are delivered or
uploaded to mailboxes, **squatter** will incrementally index the
affected mailbox (see notes, below).

In the fifth synopsis, **squatter** reads a single sync log file and
performs incremental indexing on the mailbox(es) listed therein. This
is sometimes useful for cleaning up after problems with rolling mode.
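
As a rough sketch, the index-mode synopses above might be invoked as
follows, using only options described on this page. The mailbox name
``user.jane``, the user name ``jane@example.com`` and the sync log path
are hypothetical placeholders, and the rolling-mode example assumes that
a "squatter" sync log channel has been defined in
:cyrusman:`imapd.conf(5)`:

    ::

        # index all mailboxes from scratch, pausing 2 seconds after each one
        squatter -v -S 2

        # incrementally index one mailbox (hypothetical name) and its sub-mailboxes
        squatter -i -r user.jane

        # incrementally index all mailboxes of one user (hypothetical name)
        squatter -i -u jane@example.com

        # rolling mode in the foreground, logging to standard error
        squatter -R -d -n squatter

        # replay a single sync log file (hypothetical path), then exit
        squatter -v -f /path/to/synclogfile
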

In the sixth synopsis, **squatter** will compact indices from
*srctier(s)* to *desttier*, optionally reindexing (**-X**) or filtering
expunged records (**-F**) in the process. The optional **-T** flag may be
used to specify members of srctiers which must be reindexed. These files are
eventually copied with `rsync -a` and then removed by `rm`.
`rsync` can increase the load average of the system, especially when the
temporary directory is on `tmpfs`. To throttle `rsync` it is possible to
modify the call in `imap/search_xapian.c` and pass `-\\-bwlimit=<number>` as a
further parameter. The **-o** flag may be used to direct that a single index be
copied, rather than compacted, from *srctier* to *desttier*. The **-u** flag
may be used to restrict operation to the specified user(s).

For all modes, the **-S** option may be specified, causing **squatter** to
pause *seconds* seconds after each mailbox, to smooth loads.

When using the Xapian engine, the **-Z** option may be specified for
the indexing modes. This tells **squatter** to consult the Xapian
internally indexed GUIDs, rather than relying on what's stored in
`cyrus.indexed.db`, allowing for recovery from a broken
`cyrus.indexed.db` at the sacrifice of efficiency.

.. Note::
    Incremental updates are very inefficient with the SQUAT search
    engine. If using SQUAT for large and active mailboxes, you should
    run **squatter** periodically as an EVENT in ``cyrus.conf(5)``.

.. Note::
    Messages and mailboxes that have not been indexed CAN still be
    SEARCHed, just not as quickly as those with an index.

**squatter** |default-conf-text|

Options
=======

.. program:: squatter

.. option:: -C config-file

    |cli-dash-c-text|

.. option:: -a, --squat-annot

    Only create indexes for mailboxes which have the shared
    */vendor/cmu/cyrus-imapd/squat* annotation set to "true".

    The value of the */vendor/cmu/cyrus-imapd/squat* annotation is
    inherited by all children of the given mailbox, so an entire
    mailbox tree can be indexed (or not indexed) by setting a single
    annotation on the root of that tree with a value of "true" (or
    "false"). If a mailbox does not have a
    */vendor/cmu/cyrus-imapd/squat* annotation set on it (or does not
    inherit one), then the mailbox is not indexed. In other words, the
    implicit value of */vendor/cmu/cyrus-imapd/squat* is "false".

.. option:: -A, --audit

    Audit the specified mailboxes (or all mailboxes) and report any
    unindexed messages.
    |master-new-feature|

.. option:: -d, --nodaemon

    In rolling mode, do not background, and emit log messages on
    standard error. Useful for debugging.
    |v3-new-feature|

.. option:: -B, --skip-locked

    In compact mode, use a non-blocking lock to start, and skip any
    users who have their xapianactive file locked at the time (i.e.
    another reindex task).
    |master-new-feature|

.. option:: -F, --filter

    In compact mode, filter the resulting database to only include
    messages which are not expunged in mailboxes with existing
    name/uidvalidity.
    |v3-new-feature|

.. option:: -f synclogfile, --synclog=synclogfile

    Read the *synclogfile* and incrementally index all the mailboxes
    listed therein, then exit.
    |v3-new-feature|

.. option:: -h, --help

    Display this usage information.

.. option:: -i, --incremental

    Perform incremental updates where indexes already exist.

.. option:: -N name, --name=name

    Only index mailboxes beginning with *name* while iterating through
    the mailbox list derived from other options.

.. option:: -n channel, --channel=channel

    In rolling mode, specify the name of the sync log *channel* that
    **squatter** will listen to. The default is "squatter".
    This channel **must** be defined in :cyrusman:`imapd.conf(5)` before
    being used.
    |v3-new-feature|

.. option:: -o, --copydb

    In compact mode, if only one source database is selected, just copy
    it to the destination rather than compacting.
    |v3-new-feature|

.. option:: -p, --allow-partials

    When indexing, allow messages to be partially indexed. This may
    occur if attachment indexing is enabled but indexing failed for
    one or more attachment body parts. If this flag is set, the
    message is partially indexed and **squatter** continues. Otherwise
    **squatter** aborts with an error. Also see **-P**.
    Xapian only.
    |master-new-feature|

.. option:: -P, --reindex-partials

    When reindexing, attempt to reindex any partially indexed
    messages (see **-p**). Setting this flag implies **-Z**.
    Xapian only.
    |master-new-feature|

.. option:: -L, --reindex-minlevel=level

    When reindexing, index all messages that have an index level
    less than *level*. Currently, Cyrus supports only two index levels:
    a message for which attachment indexing was never attempted has
    index level 1; a message that has indexed attachments, or does not
    contain attachments, has index level 3. Consequently, running
    **squatter** with minlevel set to 3 will cause it to attempt
    reindexing all messages for which attachment indexing was never
    attempted. Future Cyrus versions may introduce additional levels.
    Setting this flag implies **-Z**.
    Xapian only.
    |master-new-feature|

.. option:: -R, --rolling

    Run in rolling mode; **squatter** runs as a daemon listening to a
    sync log channel and continuously incrementally indexing mailboxes.
    See also **-d** and **-n**.
    |v3-new-feature|

.. option:: -r, --recursive

    Recursively create indexes for all sub-mailboxes of the user,
    mailboxes or mailbox prefixes given as arguments.

.. option:: -s, --squat-skip[=delta]

    Skip mailboxes that have not been modified since the last index. This
    is achieved by comparing the last modification time of a mailbox to
    the last time the squat index of this mailbox was updated. If the
    mailbox modification time (plus delta) is less than the squat
    index modification time, then the mailbox is skipped. The optional
    argument value delta is defined in seconds and must be equal to or
    higher than zero; the default value is 60.
    Squat only.
    |master-new-feature|

.. option:: -S seconds, --sleep=seconds

    After processing each mailbox, sleep for *seconds* seconds before
    continuing. Can be used to provide some load balancing. Accepts
    fractional amounts.
    |v3-new-feature|

.. option:: -T reindextiers, --reindex-tier=reindextiers

    In compact mode, a comma-separated subset of the source tiers
    (see **-t**) to be reindexed. Similar to **-X** but allows
    limiting the tiers that will be reindexed.
    |v3-new-feature|

.. option:: -t srctiers, --srctier=srctiers

    In compact mode, the comma-separated source tier(s) for the compacted
    indices. At least one source tier must be specified in compact mode.
    Xapian only.
    |v3-new-feature|

.. option:: -u name, --user=name

    Extra options refer to usernames (e.g. foo@bar.com) rather than
    mailbox names. Usernames are space-separated.
    |v3-new-feature|

.. option:: -U, --only-upgrade

    In compact mode, only compact if re-indexing.
    Xapian only.
    |master-new-feature|

.. option:: -v, --verbose

    Increase the verbosity of progress/status messages. Some messages are
    always sent to syslog and are additionally emitted on the terminal
    when this option is given; others are sent to syslog only when **-v**
    is given. In rolling and synclog modes, **-vv** sends even more
    messages to syslog.

.. option:: -X, --reindex

    Reindex all the messages before compacting. This mode reads all
    the lists of messages indexed by the listed tiers, and re-indexes
    them into a temporary database before compacting that into place.
    Xapian only.
    |v3-new-feature|

.. option:: -z desttier, --compact=desttier

    In compact mode, the destination tier for the compacted indices.
    This must be specified in compact mode.
    Xapian only.
    |v3-new-feature|

.. option:: -Z, --internalindex

    When indexing messages, use the Xapian internal cyrusid rather than
    referencing the ranges of already indexed messages to know if a
    particular message is indexed. Useful if the ranges get out of
    sync with the actual messages (e.g. if files on a tier are lost).
    Xapian only.
    |master-new-feature|

Examples
========

**squatter** is typically deployed via entries in
:cyrusman:`cyrus.conf(5)`, in either the DAEMON or EVENTS sections.

For the older SQUAT search engine, which offers poor performance in
rolling mode (**-R**), we recommend triggering periodic runs via entries
in the EVENTS section, as follows:

Sample entries from the EVENTS section of :cyrusman:`cyrus.conf(5)` for
periodic **squatter** runs:

    ::

        EVENTS {
            # reindex changed mailboxes (fulltext) approximately every three hours
            squatter1   cmd="/usr/bin/ionice -c idle /usr/lib/cyrus/bin/squatter -i" period=180

            # reindex all mailboxes (fulltext) daily
            squattera   cmd="/usr/lib/cyrus/bin/squatter" at=0117
        }

For the newer Xapian search engine, and with sufficiently fast storage,
the rolling mode (**-R**) offers advantages. Use of rolling mode requires
that **squatter** be invoked in the DAEMON section.

Sample entries for the DAEMON section of :cyrusman:`cyrus.conf(5)` for
rolling **squatter** operation:

    ::

        DAEMON {
            # run a rolling squatter using the default sync_log channel "squatter"
            squatter cmd="squatter -R"

            # run a rolling squatter using a specific sync_log channel
            squatter cmd="squatter -R -n indexer"
        }

.. Note::

    When using the *-R* rolling mode, you MUST enable sync_log
    operation in :cyrusman:`imapd.conf(5)` via the `sync_log: on`
    setting, and MUST define a sync_log channel via the
    `sync_log_channels:` setting. If also using replication, you must
    either explicitly specify your replication sync_log channel via the
    `sync_log_channels` directive with a name, or specify the default
    empty name with "" (the two-character string U+22 U+22). See
    :cyrusman:`imapd.conf(5)` for details.

.. Note::

    When configuring rolling search indexing on a **replica**, one must
    consider whether sync_logs will be written at all. In this case,
    please consider the setting `sync_log_unsuppressable_channels` to
    ensure that the sync_log channel upon which one's squatter instance
    depends will continue to be written. See :cyrusman:`imapd.conf(5)`
    for details.

.. Note::

    When using the Xapian search engine, you must define various
    settings in :cyrusman:`imapd.conf(5)`. Please read all relevant
    Xapian documentation in this release before using Xapian.

[NB: More examples needed]

History
=======

Support for additional search engines was added in version 3.0.

The following command-line switches were added in version 3.0:

    .. parsed-literal::

        **-F -R -X -d -f -o -u**

The following command-line settings were added in version 3.0:

    .. parsed-literal::

        **-S** *<seconds>*, **-T** *<directory>*, **-f** *<synclogfile>*, **-n** *<channel>*, **-t** *srctier*..., **-z** *desttier*

Files
=====

/etc/imapd.conf,
/etc/cyrus.conf

See Also
========

:cyrusman:`imapd.conf(5)`, :cyrusman:`cyrus.conf(5)`

.. only:: html

    :ref:`configuring-xapian`