[[faq]]
= Frequently Asked Questions

[partintro]
--
This section will be updated as more frequently asked questions arise.

* <<faq_doc_error,How can I report an error in the documentation?>>
* <<faq_partial_delete,Can I delete only certain data from within indices?>>
* <<faq_strange_chars,Can Curator handle index names with strange characters?>>
* <<entrypoint-fix,I'm getting `DistributionNotFound` and `entry_point` errors when I try to run Curator. What am I doing wrong?>>
* <<faq_unicode,Why am I getting an error message about ASCII encoding?>>
--

[[faq_doc_error]]
== Q: How can I report an error in the documentation?

=== A: Use the "Edit" link on any page

See <<site-corrections,Site Corrections>>.

[[faq_partial_delete]]
== Q: Can I delete only certain data from within indices?

=== A: It's complicated

[float]
TL;DR: No. Curator can only delete entire indices.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[float]
Full answer:
^^^^^^^^^^^^

As a thought exercise, think of Elasticsearch indices as being like databases,
or tablespaces within a database. If you had hundreds of millions of rows to
delete from your database, would you run a separate
`DELETE from TABLE where date<YYYY.MM.dd` to assemble hundreds of millions of
individual delete operations every day, or would you partition your tables in a
way that you could simply run `DROP table TABLENAME.YYYY.MM.dd`? The strain on
your database would be astronomical on the former and next to nothing on the
latter. Elasticsearch works much the same way. While Elasticsearch _can_
technically do both methods, for use cases with time-series data (like logging),
we recommend dropping entire indices over the extremely I/O-expensive
search-and-delete method. Curator was created to help fill that need.

While you can store different types within different indices (e.g.
syslog-2014.05.05, apache-2015.05.06), this gets very expensive, very quickly in
a totally different way. Each shard in Elasticsearch is a Lucene index. Each
index requires a portion of the heap to exist and be kept current. If you have 3
daily indices with 5 primary shards each, you suddenly have reduced the
available heap space for shard management by a factor of 3, having gone from 5
shards to 15, __per index,__ not counting multiple indices per day. The ways to
mitigate this (if you pursue this route) include massive daily indexing boxes
and using shard allocation/routing to move indices to specific members of the
cluster where they can have less effect; keeping fewer days of information;
having more nodes in your cluster, and so forth.

[float]
Conclusion:
^^^^^^^^^^^

While it may be desirable to have different life-cycles for your data, sometimes
it's just easier and cheaper to store everything as long as the longest
life-cycle you wish to maintain.

[float]
Post-script:
^^^^^^^^^^^^

Even though it is neither recommended footnote:[There are reasons Elasticsearch does not recommend this, particularly for time-series data. For more information, read http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html and watch what happens to your segments when you delete data.],
nor a best practice, it is still possible to perform these search & delete
operations yourself, using the {ref}/docs-delete-by-query.html[Delete-by-Query
API]. Curator will not be modified to perform operations such as these, however.
Curator is meant to manage at the index level, rather than the data level.

'''''

[[faq_strange_chars]]
== Q: Can Curator handle index names with strange characters?

=== A: Yes!

This problem can be resolved by using the
<<filtertype_pattern,pattern filtertype>> with <<fe_kind,kind>> set to `regex`,
and <<fe_value,value>> set to the needed regular expression.

[float]
The Problem:
^^^^^^^^^^^^

Illegal characters make it hard to delete indices.

------------------
% curl logs.example.com:9200/_cat/indices
red    }?ebc-2015.04.08.03
                          sip-request{ 5 1 0 0 632b 316b
red    }?ebc-2015.04.08.03
                          sip-response 5 1 0 0 474b 237b
red    ?ebc-2015.04.08.02
                          sip-request{ 5 1 0 0 474b 316b
red
eb                         5 1 0 0 632b 316b
red    ?e                  5 1 0 0 632b 316b
------------------

The index names appear to contain tab characters and possibly newline
characters. This makes it hard to use the HTTP API to delete the indices.

Dumping all the index settings out:

[source,sh]
-------
curl -XGET 'localhost:9200/*/_settings?pretty'
-------

...reveals the index names as the first key in the resulting JSON. In this
case, the names were very atypical:

-------
}\b?\u0011ebc-2015.04.08.02\u000Bsip-request{
}\u0006?\u0011ebc-2015.04.08.03\u000Bsip-request{
}\u0003?\u0011ebc-2015.04.08.03\fsip-response
...
-------

Curator lets you use regular expressions to select indices to perform actions
on.

WARNING: Before attempting an action, see what will be affected by using the
`--dry-run` flag first.

To delete the first three from the above example, use `'.*sip.*'` as your
regular expression.

NOTE: In an <<actionfile,actionfile>>, regular expressions and strftime date
strings _must_ be encapsulated in single-quotes.

The next one is trickier. The real name of the index was `\n\u0011eb`. The
regular expression `.*b$` did not work, but `'\n.*'` did.

The last index can be deleted with a regular expression of `'.*e$'`.
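
The patterns above can be sanity-checked outside Curator before any action is run. The following is a minimal sketch using Python's standard `re` module. It assumes the pattern is applied from the start of the index name, as `re.match` does; that assumption is consistent with the behavior described above but is not confirmed here. The helper `matching_indices` and the `INDEX_NAMES` list are hypothetical names used only for illustration, and the names are the ones shown in the `_settings` output.

```python
import re

# Index names from the _settings output above, written as Python string
# literals (\x08 = \b, \x0b = \u000B, \x0c = \f, \x11 = \u0011).
INDEX_NAMES = [
    "}\x08?\x11ebc-2015.04.08.02\x0bsip-request{",
    "}\x06?\x11ebc-2015.04.08.03\x0bsip-request{",
    "}\x03?\x11ebc-2015.04.08.03\x0csip-response",
    "\n\x11eb",
]


def matching_indices(pattern, names):
    """Return the names that `pattern` matches from the start of the name.

    Assumption: anchoring at the start (re.match) mirrors how the regex
    is applied. This helper is illustrative and is not Curator code.
    """
    regex = re.compile(pattern)
    return [name for name in names if regex.match(name)]


if __name__ == "__main__":
    for pattern in (r".*sip.*", r".*b$", r"\n.*"):
        hits = matching_indices(pattern, INDEX_NAMES)
        print(f"{pattern!r} matches {len(hits)} name(s): {hits!r}")
```

Under that assumption, `.*b$` fails on the fourth name because `.` does not match a newline by default, so the pattern cannot get past the leading `\n`. Starting the pattern with an explicit `\n` avoids the problem.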

The resulting <<actionfile,actionfile>> might look like this:

[source,yaml]
--------
actions:
  1:
    description: Delete indices with strange characters that match regex '.*sip.*'
    action: delete_indices
    options:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: regex
      value: '.*sip.*'
  2:
    description: Delete indices with strange characters that match regex '\n.*'
    action: delete_indices
    options:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: regex
      value: '\n.*'
  3:
    description: Delete indices with strange characters that match regex '.*e$'
    action: delete_indices
    options:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: regex
      value: '.*e$'
--------

'''''

[[entrypoint-fix]]
== Q: I'm getting `DistributionNotFound` and `entry_point` errors when I try to run Curator. What am I doing wrong?

=== A: You likely need to upgrade `setuptools`

If you are still unable to install, or get strange errors about dependencies you
know you've installed, or messages mentioning `entry_point`, you may need to
upgrade the `setuptools` package. This is especially common with RHEL and
CentOS installs, and their variants, as they depend on Python 2.6.

If you can run `pip install -U setuptools`, it should correct the problem.

You may also be able to download and install manually:

. `wget https://pypi.python.org/packages/source/s/setuptools/setuptools-15.1.tar.gz -O setuptools-15.1.tar.gz`
. `pip install setuptools-15.1.tar.gz`

Any dependencies this version of setuptools may require will have to be manually
acquired and installed for your platform.

For more information about setuptools, see https://pypi.python.org/pypi/setuptools

This fix originally appeared https://github.com/elastic/curator/issues/56#issuecomment-77843587[here].

'''''

[[faq_unicode]]
== Q: Why am I getting an error message about ASCII encoding?

=== A: You need to change your encoding to UTF-8

If you see messages like this:

[source,sh]
-----------
Click will abort further execution because Python 3 was configured to use ASCII
as encoding for the environment. Either run this under Python 2 or consult
http://click.pocoo.org/python3/ for mitigation steps.

This system lists a couple of UTF-8 supporting locales that
you can pick from. The following suitable locales where
discovered: aa_DJ.utf8, aa_ER.utf8, aa_ET.utf8, ...
-----------

You are likely running Curator with Python 3, or the RPM/DEB package, which was
compiled with Python 3. Using the command-line library
http://click.pocoo.org[click] with Python 3 requires your locale to be Unicode.
You can set this up by exporting the `LC_ALL` environment variable like this:

[source,sh]
-----------
$ export LC_ALL=mylocale.utf8
-----------

Where `mylocale.utf8` is one of the listed "suitable locales."

You can also set the locale on the command-line before the Curator command:

[source,sh]
-----------
$ LC_ALL=mylocale.utf8 curator [ARGS] ...
-----------

IMPORTANT: If you use `export`, be sure to choose the correct locale as it will
be set for the duration of your terminal session.

'''''
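
For the ASCII encoding question above, it can help to confirm which locale setting is in effect before and after changing `LC_ALL`. The following is a rough sketch that inspects only environment variables, using the POSIX precedence of `LC_ALL` over `LC_CTYPE` over `LANG`. It is a heuristic, not a reproduction of the check Click performs, and the function names `effective_ctype_locale` and `is_utf8_locale` are hypothetical.

```python
import os


def effective_ctype_locale(env):
    """Return the locale value that governs character encoding, or None.

    POSIX precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    Empty values are treated as unset.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return value
    return None


def is_utf8_locale(value):
    """Return True if the locale's codeset (the part after '.') is UTF-8."""
    if not value or "." not in value:
        return False
    # Drop an optional modifier such as "@euro" before comparing.
    codeset = value.split(".", 1)[1].split("@", 1)[0]
    return codeset.lower().replace("-", "") == "utf8"


if __name__ == "__main__":
    current = effective_ctype_locale(os.environ)
    print(f"effective locale: {current!r}, UTF-8: {is_utf8_locale(current)}")
```

If the script reports that the effective locale is not UTF-8, setting `LC_ALL` to one of the listed suitable locales, as described above, is the likely remedy.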