[[faq]]
= Frequently Asked Questions

[partintro]
--
This section will be updated as more frequently asked questions arise.

* <<faq_doc_error,How can I report an error in the documentation?>>
* <<faq_partial_delete,Can I delete only certain data from within indices?>>
* <<faq_strange_chars,Can Curator handle index names with strange characters?>>
* <<entrypoint-fix,I'm getting `DistributionNotFound` and `entry_point` errors when I try to run Curator.  What am I doing wrong?>>
* <<faq_unicode,Why am I getting an error message about ASCII encoding?>>
--

[[faq_doc_error]]
== Q: How can I report an error in the documentation?

=== A: Use the "Edit" link on any page

See <<site-corrections,Site Corrections>>.

[[faq_partial_delete]]
== Q: Can I delete only certain data from within indices?

=== A: It's complicated

[float]
TL;DR: No. Curator can only delete entire indices.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[float]
Full answer:
^^^^^^^^^^^^

As a thought exercise, think of Elasticsearch indices as being like databases,
or tablespaces within a database. If you had hundreds of millions of rows to
delete from your database every day, would you run
`DELETE from TABLE where date<YYYY.MM.dd` and assemble hundreds of millions of
individual delete operations, or would you partition your tables so that you
could simply run `DROP table TABLENAME.YYYY.MM.dd`? The strain on your database
would be astronomical with the former and next to nothing with the latter.
Elasticsearch works much the same way. While Elasticsearch _can_ technically do
both, for use cases with time-series data (like logging) we recommend dropping
entire indices rather than using the extremely I/O-expensive search-and-delete
method. Curator was created to help fill that need.

While you can store different types of data in different indices (e.g.
syslog-2014.05.05, apache-2015.05.06), this gets very expensive, very quickly,
in a totally different way. Each shard in Elasticsearch is a Lucene index, and
each of them requires a portion of the heap to exist and be kept current. If you
have 3 daily indices with 5 primary shards each, you have suddenly cut the heap
space available per shard by a factor of 3, having gone from 5 shards to 15
__per day__, not counting replicas. The ways to mitigate this (if you pursue
this route) include using very large nodes for daily indexing, using shard
allocation/routing to move indices to specific members of the cluster where
they have less effect, keeping fewer days of information, adding more nodes to
your cluster, and so forth.
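
As a rough illustration of the shard arithmetic above, the following Python
sketch counts the shards created per day. The function name, the replica count,
and the 30-day retention period are assumptions made for this example only; they
are not Curator or Elasticsearch settings.

[source,python]
--------
def shards_per_day(daily_indices, primary_shards, replicas=0):
    """Shards created each day: every primary plus its replica copies."""
    return daily_indices * primary_shards * (1 + replicas)


# One daily index versus three daily indices, 5 primary shards each.
single = shards_per_day(daily_indices=1, primary_shards=5)
triple = shards_per_day(daily_indices=3, primary_shards=5)

# Assumed for illustration: 1 replica per primary, 30 days kept in the cluster.
triple_with_replicas = shards_per_day(3, 5, replicas=1)
kept_for_30_days = triple_with_replicas * 30

print(single, triple, triple_with_replicas, kept_for_30_days)
--------

Each shard counted here is a Lucene index that needs its own share of the heap,
so the total grows with every additional daily index and every extra day of
retention.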

[float]
Conclusion:
^^^^^^^^^^^

While it may be desirable to have different life-cycles for your data, sometimes
it's just easier and cheaper to store everything for as long as the longest
life-cycle you wish to maintain.

[float]
Post-script:
^^^^^^^^^^^^

Even though it is neither recommended footnote:[There are reasons Elasticsearch does not recommend this, particularly for time-series data. For more information, read http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html and watch what happens to your segments when you delete data.]
nor a best practice, it is still possible to perform these search-and-delete
operations yourself, using the {ref}/docs-delete-by-query.html[Delete-by-Query
API]. Curator will not be modified to perform such operations, however.
Curator is meant to manage at the index level, rather than the data level.
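
For comparison only, here is a minimal Python sketch of what the two approaches
look like as HTTP requests. It builds the requests and sends nothing. The index
name comes from the example above; the `@timestamp` field, the `now-30d` cutoff,
and both helper function names are assumptions made for this illustration.

[source,python]
--------
import json


def delete_by_query_request(index, field, older_than):
    """Build (method, path, body) for a Delete-by-Query call. Nothing is sent."""
    body = {"query": {"range": {field: {"lt": older_than}}}}
    return "POST", "/" + index + "/_delete_by_query", json.dumps(body)


def delete_index_request(index):
    """Build (method, path, body) for dropping an entire index. Nothing is sent."""
    return "DELETE", "/" + index, None


# Document-level delete: every matching document must be found and removed.
print(delete_by_query_request("syslog-2014.05.05", "@timestamp", "now-30d"))

# Index-level delete: the kind of operation Curator manages.
print(delete_index_request("syslog-2014.05.05"))
--------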

'''''

[[faq_strange_chars]]
== Q: Can Curator handle index names with strange characters?

=== A: Yes!

This problem can be resolved by using the
<<filtertype_pattern,pattern filtertype>> with <<fe_kind,kind>> set to `regex`,
and <<fe_value,value>> set to the needed regular expression.

[float]
The Problem:
^^^^^^^^^^^^

Illegal characters make it hard to delete indices.

------------------
% curl logs.example.com:9200/_cat/indices
red    }?ebc-2015.04.08.03
                          sip-request{ 5 1         0  0     632b     316b
red    }?ebc-2015.04.08.03
                          sip-response 5 1         0  0     474b     237b
red    ?ebc-2015.04.08.02
                         sip-request{ 5 1         0  0     474b     316b
red
eb                               5 1         0  0     632b     316b
red    ?e                                5 1         0  0     632b     316b
------------------

&nbsp;

The names appear to contain tab characters and possibly newline characters,
which makes it hard to use the HTTP API to delete the indices.

Dumping all the index settings out:

[source,sh]
-------
curl -XGET 'localhost:9200/*/_settings?pretty'
-------

&nbsp;

...reveals the index names as the first key in the resulting JSON.  In this
case, the names were very atypical:

-------
}\b?\u0011ebc-2015.04.08.02\u000Bsip-request{
}\u0006?\u0011ebc-2015.04.08.03\u000Bsip-request{
}\u0003?\u0011ebc-2015.04.08.03\fsip-response
...
-------

&nbsp;

Curator lets you use regular expressions to select the indices to act on.

WARNING: Before attempting an action, use the `--dry-run` flag to see what will
be affected.

To delete the first three from the above example, use `'.*sip.*'` as your
regular expression.

NOTE: In an <<actionfile,actionfile>>, regular expressions and strftime date
strings _must_ be encapsulated in single-quotes.

The next one is trickier. The real name of the index was `\n\u0011eb`. The
regular expression `.*b$` did not work, but `'\n.*'` did.

The last index can be deleted with a regular expression of `'.*e$'`.
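
The behavior described above can be reproduced outside Curator with Python's
`re` module. This is only a sketch: it assumes the pattern is applied from the
start of the index name (as `re.match` does), which is consistent with `.*b$`
failing above, and it is not taken from Curator's source. The helper name
`matching` is hypothetical. By default `.` does not match a newline, so under
that assumption `.*b$` cannot get past the leading `\n` of the fourth name.

[source,python]
--------
import re

# The index names from the settings output above, written as Python literals.
names = [
    "}\b?\x11ebc-2015.04.08.02\x0bsip-request{",
    "}\x06?\x11ebc-2015.04.08.03\x0bsip-request{",
    "}\x03?\x11ebc-2015.04.08.03\x0csip-response",
    "\n\x11eb",
]


def matching(pattern, candidates, anchored=True):
    """Return the candidates selected by `pattern`.

    anchored=True uses re.match (must match from the first character);
    anchored=False uses re.search (may match anywhere in the name).
    """
    regex = re.compile(pattern)
    test = regex.match if anchored else regex.search
    return [name for name in candidates if test(name)]


# Raw strings keep the backslash, just as YAML single quotes do.
for pattern in (r".*sip.*", r"\n.*", r".*b$"):
    print(pattern, "->", matching(pattern, names))
--------

In YAML single quotes, `'\n.*'` reaches the regex engine as a backslash followed
by `n`, which the engine itself interprets as a newline. The raw string
`r"\n.*"` in the sketch is the equivalent.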

The resulting <<actionfile,actionfile>> might look like this:

[source,yaml]
--------
actions:
  1:
    description: Delete indices with strange characters that match regex '.*sip.*'
    action: delete_indices
    options:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: regex
      value: '.*sip.*'
  2:
    description: Delete indices with strange characters that match regex '\n.*'
    action: delete_indices
    options:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: regex
      value: '\n.*'
  3:
    description: Delete indices with strange characters that match regex '.*e$'
    action: delete_indices
    options:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: regex
      value: '.*e$'
--------

&nbsp;

'''''

[[entrypoint-fix]]
== Q: I'm getting `DistributionNotFound` and `entry_point` errors when I try to run Curator.  What am I doing wrong?

=== A: You likely need to upgrade `setuptools`

If you are still unable to install, or get strange errors about dependencies you
know you've installed, or messages mentioning `entry_point`, you may need to
upgrade the `setuptools` package.  This is especially common with RHEL and
CentOS installs, and their variants, as they depend on Python 2.6.

If you can run `pip install -U setuptools`, it should correct the problem.

You may also be able to download and install manually:

. `wget https://pypi.python.org/packages/source/s/setuptools/setuptools-15.1.tar.gz -O setuptools-15.1.tar.gz`
. `pip install setuptools-15.1.tar.gz`

Any dependencies this version of setuptools may require will have to be manually
acquired and installed for your platform.

For more information about setuptools, see https://pypi.python.org/pypi/setuptools

This fix originally appeared https://github.com/elastic/curator/issues/56#issuecomment-77843587[here].

'''''

[[faq_unicode]]
== Q: Why am I getting an error message about ASCII encoding?

=== A: You need to change your encoding to UTF-8

If you see messages like this:

[source,sh]
-----------
Click will abort further execution because Python 3 was configured to use ASCII
as encoding for the environment.  Either run this under Python 2 or consult
http://click.pocoo.org/python3/ for mitigation steps.

This system lists a couple of UTF-8 supporting locales that
you can pick from.  The following suitable locales where
discovered: aa_DJ.utf8, aa_ER.utf8, aa_ET.utf8, ...
-----------

You are likely running Curator with Python 3, or with the RPM/DEB package, which
was compiled with Python 3.  Using the command-line library
http://click.pocoo.org[click] with Python 3 requires a Unicode (UTF-8) locale.
You can set this up by exporting the `LC_ALL` environment variable like this:

[source,sh]
-----------
$ export LC_ALL=mylocale.utf8
-----------

Where `mylocale.utf8` is one of the listed "suitable locales."

You can also set the locale on the command-line before the Curator command:

[source,sh]
-----------
$ LC_ALL=mylocale.utf8 curator [ARGS] ...
-----------

IMPORTANT: If you use `export`, be sure to choose the correct locale, as it will
be set for the duration of your terminal session.
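
To check what Python sees after changing `LC_ALL`, a small script such as the
following may help. It is a sketch, not Click's actual code: it assumes the
error above corresponds to Python's preferred encoding resolving to the ASCII
codec, and the helper name `is_ascii_encoding` is hypothetical.

[source,python]
--------
import codecs
import locale


def is_ascii_encoding(encoding_name):
    """Return True if `encoding_name` resolves to Python's ASCII codec."""
    try:
        return codecs.lookup(encoding_name).name == "ascii"
    except LookupError:
        # Unknown encoding names are treated as a problem as well.
        return True


current = locale.getpreferredencoding()
print("preferred encoding:", current)
print("needs a UTF-8 locale:", is_ascii_encoding(current))
--------

Run it in the same shell session, and with the same Python interpreter, that you
use to run Curator, so that it sees the same environment.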

'''''
