1////
2/**
3 *
4 * Licensed to the Apache Software Foundation (ASF) under one
5 * or more contributor license agreements.  See the NOTICE file
6 * distributed with this work for additional information
7 * regarding copyright ownership.  The ASF licenses this file
8 * to you under the Apache License, Version 2.0 (the
9 * "License"); you may not use this file except in compliance
10 * with the License.  You may obtain a copy of the License at
11 *
12 *     http://www.apache.org/licenses/LICENSE-2.0
13 *
14 * Unless required by applicable law or agreed to in writing, software
15 * distributed under the License is distributed on an "AS IS" BASIS,
16 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
17 * See the License for the specific language governing permissions and
18 * limitations under the License.
19 */
20////
21
22[[trouble]]
23= Troubleshooting and Debugging Apache HBase
24:doctype: book
25:numbered:
26:toc: left
27:icons: font
28:experimental:
29
30[[trouble.general]]
31== General Guidelines
32
Always start with the master log (TODO: Which lines?). Normally it just prints the same lines over and over again.
If not, then there's an issue.
Google or link:http://search-hadoop.com[search-hadoop.com] should return some hits for the exceptions you're seeing.
36
An error rarely comes alone in Apache HBase: when something goes wrong, what usually follows is hundreds of exceptions and stack traces coming from all over the place.
The best way to approach this type of problem is to walk the log back to where it all began. For example, one trick with RegionServers is that they print some metrics when aborting, so grepping for _Dump_ should get you close to the start of the problem.
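
For example, a minimal sketch (the log path, user, and hostname are illustrative; adjust them to your installation):

[source,bourne]
----
# Find where the abort began by locating the metrics dump the RegionServer prints when aborting.
grep -n "Dump" $HBASE_HOME/logs/hbase-<user>-regionserver-<hostname>.log | head
----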
39
RegionServer suicides are 'normal', as this is what they do when something goes wrong.
For example, if ulimit and max transfer threads (the two most important initial settings, see <<ulimit>> and <<dfs.datanode.max.transfer.threads>>) aren't changed, at some point the DataNodes will be unable to create new threads, which from the HBase point of view looks as if HDFS were gone.
Think about what would happen if your MySQL database were suddenly unable to access files on your local file system; it's the same with HBase and HDFS.
Another very common reason to see RegionServers committing seppuku is when they enter prolonged garbage collection pauses that last longer than the default ZooKeeper session timeout.
44For more information on GC pauses, see the link:http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/[3 part blog post] by Todd Lipcon and <<gcpause>> above.
45
46[[trouble.log]]
47== Logs
48
The key process logs are as follows... (replace <user> with the user that started the service, and <hostname> with the machine name)
50
51NameNode: _$HADOOP_HOME/logs/hadoop-<user>-namenode-<hostname>.log_
52
53DataNode: _$HADOOP_HOME/logs/hadoop-<user>-datanode-<hostname>.log_
54
55JobTracker: _$HADOOP_HOME/logs/hadoop-<user>-jobtracker-<hostname>.log_
56
57TaskTracker: _$HADOOP_HOME/logs/hadoop-<user>-tasktracker-<hostname>.log_
58
59HMaster: _$HBASE_HOME/logs/hbase-<user>-master-<hostname>.log_
60
61RegionServer: _$HBASE_HOME/logs/hbase-<user>-regionserver-<hostname>.log_
62
63ZooKeeper: _TODO_
64
65[[trouble.log.locations]]
66=== Log Locations
67
For stand-alone deployments the logs are obviously going to be on a single machine; however, this is a development configuration only.
69Production deployments need to run on a cluster.
70
71[[trouble.log.locations.namenode]]
72==== NameNode
73
74The NameNode log is on the NameNode server.
The HBase Master is typically run on the NameNode server, as well as ZooKeeper.
76
77For smaller clusters the JobTracker/ResourceManager is typically run on the NameNode server as well.
78
79[[trouble.log.locations.datanode]]
80==== DataNode
81
82Each DataNode server will have a DataNode log for HDFS, as well as a RegionServer log for HBase.
83
84Additionally, each DataNode server will also have a TaskTracker/NodeManager log for MapReduce task execution.
85
86[[trouble.log.levels]]
87=== Log Levels
88
89[[rpc.logging]]
90==== Enabling RPC-level logging
91
92Enabling the RPC-level logging on a RegionServer can often give insight on timings at the server.
93Once enabled, the amount of log spewed is voluminous.
94It is not recommended that you leave this logging on for more than short bursts of time.
95To enable RPC-level logging, browse to the RegionServer UI and click on _Log Level_.
Set the log level to `DEBUG` for the package `org.apache.hadoop.ipc` (that's right, `hadoop.ipc`, NOT `hbase.ipc`). Then tail the RegionServer's log.
97Analyze.
98
99To disable, set the logging level back to `INFO` level.
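
If you prefer the command line, the Hadoop `daemonlog` utility can be pointed at the RegionServer's info port to the same effect (a sketch; assumes the default info port of 16030 and a hypothetical hostname):

[source,bourne]
----
# Turn RPC-level logging on, capture what you need, then turn it back down.
hadoop daemonlog -setlevel rs1.example.com:16030 org.apache.hadoop.ipc DEBUG
hadoop daemonlog -setlevel rs1.example.com:16030 org.apache.hadoop.ipc INFO
----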
100
101[[trouble.log.gc]]
102=== JVM Garbage Collection Logs
103
HBase is memory intensive, and using the default GC you can see long pauses in all threads, including the _Juliet Pause_, aka "GC of Death". To help debug this, or to confirm it is happening, GC logging can be turned on in the Java virtual machine.
105
To enable, in _hbase-env.sh_, uncomment one of the lines below:
107
108[source,bourne]
109----
110
111# This enables basic gc logging to the .out file.
112# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"
113
114# This enables basic gc logging to its own file.
115# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"
116
117# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
118# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"
119
120# If <FILE-PATH> is not replaced, the log file(.gc) would be generated in the HBASE_LOG_DIR.
121----
122
123At this point you should see logs like so:
124
125[source]
126----
127
12864898.952: [GC [1 CMS-initial-mark: 2811538K(3055704K)] 2812179K(3061272K), 0.0007360 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
12964898.953: [CMS-concurrent-mark-start]
13064898.971: [GC 64898.971: [ParNew: 5567K->576K(5568K), 0.0101110 secs] 2817105K->2812715K(3061272K), 0.0102200 secs] [Times: user=0.07 sys=0.00, real=0.01 secs]
131----
132
In this output, the first line indicates a 0.0007360 second pause for the CMS to initially mark.
This pauses the entire VM, all threads, for that period of time.
135
The third line indicates a "minor GC", which pauses the VM for 0.0101110 seconds, aka about 10 milliseconds.
It has reduced the "ParNew" from about 5.5 MB to 576 KB.
138Later on in this cycle we see:
139
140[source]
141----
142
14364901.445: [CMS-concurrent-mark: 1.542/2.492 secs] [Times: user=10.49 sys=0.33, real=2.49 secs]
14464901.445: [CMS-concurrent-preclean-start]
14564901.453: [GC 64901.453: [ParNew: 5505K->573K(5568K), 0.0062440 secs] 2868746K->2864292K(3061272K), 0.0063360 secs] [Times: user=0.05 sys=0.00, real=0.01 secs]
14664901.476: [GC 64901.476: [ParNew: 5563K->575K(5568K), 0.0072510 secs] 2869283K->2864837K(3061272K), 0.0073320 secs] [Times: user=0.05 sys=0.01, real=0.01 secs]
14764901.500: [GC 64901.500: [ParNew: 5517K->573K(5568K), 0.0120390 secs] 2869780K->2865267K(3061272K), 0.0121150 secs] [Times: user=0.09 sys=0.00, real=0.01 secs]
14864901.529: [GC 64901.529: [ParNew: 5507K->569K(5568K), 0.0086240 secs] 2870200K->2865742K(3061272K), 0.0087180 secs] [Times: user=0.05 sys=0.00, real=0.01 secs]
14964901.554: [GC 64901.555: [ParNew: 5516K->575K(5568K), 0.0107130 secs] 2870689K->2866291K(3061272K), 0.0107820 secs] [Times: user=0.06 sys=0.00, real=0.01 secs]
15064901.578: [CMS-concurrent-preclean: 0.070/0.133 secs] [Times: user=0.48 sys=0.01, real=0.14 secs]
15164901.578: [CMS-concurrent-abortable-preclean-start]
15264901.584: [GC 64901.584: [ParNew: 5504K->571K(5568K), 0.0087270 secs] 2871220K->2866830K(3061272K), 0.0088220 secs] [Times: user=0.05 sys=0.00, real=0.01 secs]
15364901.609: [GC 64901.609: [ParNew: 5512K->569K(5568K), 0.0063370 secs] 2871771K->2867322K(3061272K), 0.0064230 secs] [Times: user=0.06 sys=0.00, real=0.01 secs]
15464901.615: [CMS-concurrent-abortable-preclean: 0.007/0.037 secs] [Times: user=0.13 sys=0.00, real=0.03 secs]
15564901.616: [GC[YG occupancy: 645 K (5568 K)]64901.616: [Rescan (parallel) , 0.0020210 secs]64901.618: [weak refs processing, 0.0027950 secs] [1 CMS-remark: 2866753K(3055704K)] 2867399K(3061272K), 0.0049380 secs] [Times: user=0.00 sys=0.01, real=0.01 secs]
15664901.621: [CMS-concurrent-sweep-start]
157----
158
The first line indicates that the CMS concurrent mark (finding garbage) has taken 2.4 seconds.
But this is a _concurrent_ 2.4 seconds; Java has not been paused at any point in time.
161
There are a few more minor GCs, then there is a pause at the second-to-last line:
163[source]
164----
165
16664901.616: [GC[YG occupancy: 645 K (5568 K)]64901.616: [Rescan (parallel) , 0.0020210 secs]64901.618: [weak refs processing, 0.0027950 secs] [1 CMS-remark: 2866753K(3055704K)] 2867399K(3061272K), 0.0049380 secs] [Times: user=0.00 sys=0.01, real=0.01 secs]
167----
168
169The pause here is 0.0049380 seconds (aka 4.9 milliseconds) to 'remark' the heap.
170
171At this point the sweep starts, and you can watch the heap size go down:
172
173[source]
174----
175
17664901.637: [GC 64901.637: [ParNew: 5501K->569K(5568K), 0.0097350 secs] 2871958K->2867441K(3061272K), 0.0098370 secs] [Times: user=0.05 sys=0.00, real=0.01 secs]
177...  lines removed ...
17864904.936: [GC 64904.936: [ParNew: 5532K->568K(5568K), 0.0070720 secs] 1365024K->1360689K(3061272K), 0.0071930 secs] [Times: user=0.05 sys=0.00, real=0.01 secs]
17964904.953: [CMS-concurrent-sweep: 2.030/3.332 secs] [Times: user=9.57 sys=0.26, real=3.33 secs]
180----
181
At this point, the CMS sweep took 3.332 seconds, and the heap went from about 2.8 GB down to about 1.3 GB.
183
The key point here is to keep all these pauses low.
CMS pauses are always low, but if your ParNew starts growing, you can see minor GC pauses approach 100ms, exceed 100ms, and hit as high as 400ms.
186
This can be due to the size of the ParNew, which should be relatively small.
If your ParNew is very large after running HBase for a while (in one example a ParNew was about 150MB), then you might have to constrain the size of ParNew (the larger it is, the longer the collections take, but if it's too small, objects are promoted to the old generation too quickly). In the example below we constrain the new generation size to 64m.
189
190Add the below line in _hbase-env.sh_:
191[source,bourne]
192----
193
194export SERVER_GC_OPTS="$SERVER_GC_OPTS -XX:NewSize=64m -XX:MaxNewSize=64m"
195----
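
Once GC logging is enabled, one quick way to spot long pauses is to scan the log for large `real=` times (a sketch; the log path is whatever you passed to `-Xloggc`, and the 0.1 second threshold is arbitrary):

[source,bourne]
----
# Print GC events whose wall-clock (real) time exceeded 0.1 seconds.
grep "real=" /path/to/gc-hbase.log |
  awk -F'real=' '{ split($2, t, " "); if (t[1] + 0 > 0.1) print }'
----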
196
197Similarly, to enable GC logging for client processes, uncomment one of the below lines in _hbase-env.sh_:
198
199[source,bourne]
200----
201
202# This enables basic gc logging to the .out file.
203# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"
204
205# This enables basic gc logging to its own file.
206# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"
207
208# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.
209# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"
210
# If <FILE-PATH> is not replaced, the log file(.gc) would be generated in the HBASE_LOG_DIR.
212----
213
214For more information on GC pauses, see the link:http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/[3 part blog post] by Todd Lipcon and <<gcpause>> above.
215
216[[trouble.resources]]
217== Resources
218
219[[trouble.resources.searchhadoop]]
220=== search-hadoop.com
221
222link:http://search-hadoop.com[search-hadoop.com] indexes all the mailing lists and is great for historical searches.
Search here first when you have an issue, as it's more than likely someone has already had your problem.
224
225[[trouble.resources.lists]]
226=== Mailing Lists
227
228Ask a question on the link:http://hbase.apache.org/mail-lists.html[Apache HBase mailing lists].
229The 'dev' mailing list is aimed at the community of developers actually building Apache HBase and for features currently under development, and 'user' is generally used for questions on released versions of Apache HBase.
230Before going to the mailing list, make sure your question has not already been answered by searching the mailing list archives first.
231Use <<trouble.resources.searchhadoop>>.
232Take some time crafting your question.
233See link:http://www.mikeash.com/getting_answers.html[Getting Answers] for ideas on crafting good questions.
234A quality question that includes all context and exhibits evidence the author has tried to find answers in the manual and out on lists is more likely to get a prompt response.
235
236[[trouble.resources.irc]]
237=== IRC
238
239#hbase on irc.freenode.net
240
241[[trouble.resources.jira]]
242=== JIRA
243
244link:https://issues.apache.org/jira/browse/HBASE[JIRA] is also really helpful when looking for Hadoop/HBase-specific issues.
245
246[[trouble.tools]]
247== Tools
248
249[[trouble.tools.builtin]]
250=== Builtin Tools
251
252[[trouble.tools.builtin.webmaster]]
253==== Master Web Interface
254
255The Master starts a web-interface on port 16010 by default.
256(Up to and including 0.98 this was port 60010)
257
258The Master web UI lists created tables and their definition (e.g., ColumnFamilies, blocksize, etc.). Additionally, the available RegionServers in the cluster are listed along with selected high-level metrics (requests, number of regions, usedHeap, maxHeap). The Master web UI allows navigation to each RegionServer's web UI.
259
260[[trouble.tools.builtin.webregion]]
261==== RegionServer Web Interface
262
RegionServers start a web-interface on port 16030 by default.
(Up to and including 0.98 this was port 60030)
265
266The RegionServer web UI lists online regions and their start/end keys, as well as point-in-time RegionServer metrics (requests, regions, storeFileIndexSize, compactionQueueSize, etc.).
267
See <<hbase_metrics>> for more information on metric definitions.
269
270[[trouble.tools.builtin.zkcli]]
271==== zkcli
272
273`zkcli` is a very useful tool for investigating ZooKeeper-related issues.
274To invoke:
275[source,bourne]
276----
277./hbase zkcli -server host:port <cmd> <args>
278----
279
280The commands (and arguments) are:
281
282[source]
283----
284  connect host:port
285  get path [watch]
286  ls path [watch]
287  set path data [version]
288  delquota [-n|-b] path
289  quit
290  printwatches on|off
291  create [-s] [-e] path data acl
292  stat path [watch]
293  close
294  ls2 path [watch]
295  history
296  listquota path
297  setAcl path acl
298  getAcl path
299  sync path
300  redo cmdno
301  addauth scheme auth
302  delete path [version]
303  setquota -n|-b val path
304----
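
For example, to check what HBase has registered in ZooKeeper (a sketch; the znodes shown assume the default `zookeeper.znode.parent` of `/hbase`):

[source,bourne]
----
# List HBase's znodes and inspect the current master znode.
./hbase zkcli -server host:port ls /hbase
./hbase zkcli -server host:port get /hbase/master
----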
305
306[[trouble.tools.external]]
307=== External Tools
308
309[[trouble.tools.tail]]
310==== tail
311
`tail` is the command-line tool that lets you look at the end of a file.
Add the `-f` option and it will refresh when new data is available.
It's useful when you are wondering what's happening, for example when a cluster is taking a long time to shut down or start up, as you can just fire up a new terminal and tail the master log (and maybe a few RegionServer logs).
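
For example (a sketch; substitute the user and hostname as described in <<trouble.log>>):

[source,bourne]
----
# Follow the master log while the cluster starts up or shuts down.
tail -f $HBASE_HOME/logs/hbase-<user>-master-<hostname>.log
----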
315
316[[trouble.tools.top]]
317==== top
318
`top` is probably one of the most important tools when first trying to see what's running on a machine and how resources are being consumed.
Here's an example from a production system:
321
322[source]
323----
324top - 14:46:59 up 39 days, 11:55,  1 user,  load average: 3.75, 3.57, 3.84
325Tasks: 309 total,   1 running, 308 sleeping,   0 stopped,   0 zombie
326Cpu(s):  4.5%us,  1.6%sy,  0.0%ni, 91.7%id,  1.4%wa,  0.1%hi,  0.6%si,  0.0%st
327Mem:  24414432k total, 24296956k used,   117476k free,     7196k buffers
328Swap: 16008732k total,	14348k used, 15994384k free, 11106908k cached
329
330  PID USER  	PR  NI  VIRT  RES  SHR S %CPU %MEM	TIME+  COMMAND
33115558 hadoop	18  -2 3292m 2.4g 3556 S   79 10.4   6523:52 java
33213268 hadoop	18  -2 8967m 8.2g 4104 S   21 35.1   5170:30 java
333 8895 hadoop	18  -2 1581m 497m 3420 S   11  2.1   4002:32 java
----
336
Here we can see that the system load averages over the last one, five, and fifteen minutes are 3.75, 3.57, and 3.84; very roughly, this means that on average about 3.75 threads were waiting for CPU time during the last minute.
In general, the _perfect_ utilization equals the number of cores; under that number the machine is underutilized, and over it the machine is overutilized.
This is an important concept; see this article to understand it better: http://www.linuxjournal.com/article/9001.
340
Apart from load, we can see that the system is using almost all its available RAM, but most of it is used for the OS cache (which is good). The swap only has a few KBs in it, and this is desired; high numbers would indicate swapping activity, which is the nemesis of Java system performance.
Another way to detect swapping is when the load average goes through the roof (although this could also be caused by things like a dying disk, among others).
343
The list of processes isn't super useful by default; all we know is that three java processes are using about 111% of the CPUs.
To know which is which, simply type `c` and each line will be expanded.
Typing `1` will show you how each individual CPU is used, instead of the average for all of them shown here.
347
348[[trouble.tools.jps]]
349==== jps
350
351`jps` is shipped with every JDK and gives the java process ids for the current user (if root, then it gives the ids for all users). Example:
352
353[source,bourne]
354----
355hadoop@sv4borg12:~$ jps
3561322 TaskTracker
35717789 HRegionServer
35827862 Child
3591158 DataNode
36025115 HQuorumPeer
3612950 Jps
36219750 ThriftServer
36318776 jmx
364----
365
366In order, we see a:
367
* Hadoop TaskTracker, which manages the local Child tasks
* HBase RegionServer, which serves regions
* Child, a MapReduce task; cannot tell which type exactly
* Hadoop DataNode, which serves blocks
* HQuorumPeer, a ZooKeeper ensemble member
* Jps, well... it's the current process
* ThriftServer, a special one that will be running only if the Thrift server was started
* jmx, a local process that's part of our monitoring platform (poorly named, maybe). You probably don't have it.
377
378You can then do stuff like checking out the full command line that started the process:
379
380[source,bourne]
381----
382hadoop@sv4borg12:~$ ps aux | grep HRegionServer
hadoop   17789  155 35.2 9067824 8604364 ?     S<l  Mar04 9855:48 /usr/java/jdk1.6.0_14/bin/java -Xmx8000m -XX:+DoEscapeAnalysis -XX:+AggressiveOpts -XX:+UseConcMarkSweepGC -XX:NewSize=64m -XX:MaxNewSize=64m -XX:CMSInitiatingOccupancyFraction=88 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/export1/hadoop/logs/gc-hbase.log -Dcom.sun.management.jmxremote.port=10102 -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.password.file=/home/hadoop/hbase/conf/jmxremote.password -Dcom.sun.management.jmxremote -Dhbase.log.dir=/export1/hadoop/logs -Dhbase.log.file=hbase-hadoop-regionserver-sv4borg12.log -Dhbase.home.dir=/home/hadoop/hbase -Dhbase.id.str=hadoop -Dhbase.root.logger=INFO,DRFA -Djava.library.path=/home/hadoop/hbase/lib/native/Linux-amd64-64 -classpath /home/hadoop/hbase/bin/../conf:[many jars]:/home/hadoop/hadoop/conf org.apache.hadoop.hbase.regionserver.HRegionServer start
384----
385
386[[trouble.tools.jstack]]
387==== jstack
388
389`jstack` is one of the most important tools when trying to figure out what a java process is doing apart from looking at the logs.
390It has to be used in conjunction with jps in order to give it a process id.
It shows a list of threads, each one with a name, and they appear in the order that they were created (so the top ones are the most recent threads).
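
To capture a dump, feed `jstack` the pid reported by `jps` (a minimal sketch; the output path is illustrative):

[source,bourne]
----
# Grab the RegionServer pid from jps and dump its threads to a file.
RS_PID=$(jps | grep HRegionServer | awk '{print $1}')
jstack $RS_PID > /tmp/regionserver-jstack.txt
----

Here are a few examples of what the stacks look like: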
392
393The main thread of a RegionServer waiting for something to do from the master:
394
395[source]
396----
397"regionserver60020" prio=10 tid=0x0000000040ab4000 nid=0x45cf waiting on condition [0x00007f16b6a96000..0x00007f16b6a96a70]
398java.lang.Thread.State: TIMED_WAITING (parking)
399    at sun.misc.Unsafe.park(Native Method)
400    - parking to wait for  <0x00007f16cd5c2f30> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
401    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
402    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1963)
403    at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:395)
404    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:647)
405    at java.lang.Thread.run(Thread.java:619)
406----
407
408The MemStore flusher thread that is currently flushing to a file:
409
410[source]
411----
412"regionserver60020.cacheFlusher" daemon prio=10 tid=0x0000000040f4e000 nid=0x45eb in Object.wait() [0x00007f16b5b86000..0x00007f16b5b87af0]
413java.lang.Thread.State: WAITING (on object monitor)
414    at java.lang.Object.wait(Native Method)
415    at java.lang.Object.wait(Object.java:485)
416    at org.apache.hadoop.ipc.Client.call(Client.java:803)
417    - locked <0x00007f16cb14b3a8> (a org.apache.hadoop.ipc.Client$Call)
418    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:221)
419    at $Proxy1.complete(Unknown Source)
420    at sun.reflect.GeneratedMethodAccessor38.invoke(Unknown Source)
421    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
422    at java.lang.reflect.Method.invoke(Method.java:597)
423    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
424    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
425    at $Proxy1.complete(Unknown Source)
426    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3390)
427    - locked <0x00007f16cb14b470> (a org.apache.hadoop.hdfs.DFSClient$DFSOutputStream)
428    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3304)
429    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
430    at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
431    at org.apache.hadoop.hbase.io.hfile.HFile$Writer.close(HFile.java:650)
432    at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.close(StoreFile.java:853)
433    at org.apache.hadoop.hbase.regionserver.Store.internalFlushCache(Store.java:467)
434    - locked <0x00007f16d00e6f08> (a java.lang.Object)
435    at org.apache.hadoop.hbase.regionserver.Store.flushCache(Store.java:427)
436    at org.apache.hadoop.hbase.regionserver.Store.access$100(Store.java:80)
437    at org.apache.hadoop.hbase.regionserver.Store$StoreFlusherImpl.flushCache(Store.java:1359)
438    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:907)
439    at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:834)
440    at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:786)
441    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:250)
442    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:224)
443    at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:146)
444----
445
446A handler thread that's waiting for stuff to do (like put, delete, scan, etc.):
447
448[source]
449----
450"IPC Server handler 16 on 60020" daemon prio=10 tid=0x00007f16b011d800 nid=0x4a5e waiting on condition [0x00007f16afefd000..0x00007f16afefd9f0]
451   java.lang.Thread.State: WAITING (parking)
452          at sun.misc.Unsafe.park(Native Method)
453          - parking to wait for  <0x00007f16cd3f8dd8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
454          at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
455          at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
456          at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
457          at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1013)
458----
459
460And one that's busy doing an increment of a counter (it's in the phase where it's trying to create a scanner in order to read the last value):
461
462[source]
463----
464"IPC Server handler 66 on 60020" daemon prio=10 tid=0x00007f16b006e800 nid=0x4a90 runnable [0x00007f16acb77000..0x00007f16acb77cf0]
465   java.lang.Thread.State: RUNNABLE
466          at org.apache.hadoop.hbase.regionserver.KeyValueHeap.<init>(KeyValueHeap.java:56)
467          at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:79)
468          at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:1202)
469          at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.<init>(HRegion.java:2209)
470          at org.apache.hadoop.hbase.regionserver.HRegion.instantiateInternalScanner(HRegion.java:1063)
471          at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1055)
472          at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1039)
473          at org.apache.hadoop.hbase.regionserver.HRegion.getLastIncrement(HRegion.java:2875)
474          at org.apache.hadoop.hbase.regionserver.HRegion.incrementColumnValue(HRegion.java:2978)
475          at org.apache.hadoop.hbase.regionserver.HRegionServer.incrementColumnValue(HRegionServer.java:2433)
476          at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
477          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
478          at java.lang.reflect.Method.invoke(Method.java:597)
479          at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:560)
480          at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1027)
481----
482
483A thread that receives data from HDFS:
484
485[source]
486----
487"IPC Client (47) connection to sv4borg9/10.4.24.40:9000 from hadoop" daemon prio=10 tid=0x00007f16a02d0000 nid=0x4fa3 runnable [0x00007f16b517d000..0x00007f16b517dbf0]
488   java.lang.Thread.State: RUNNABLE
489          at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
490          at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
491          at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
492          at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
493          - locked <0x00007f17d5b68c00> (a sun.nio.ch.Util$1)
494          - locked <0x00007f17d5b68be8> (a java.util.Collections$UnmodifiableSet)
495          - locked <0x00007f1877959b50> (a sun.nio.ch.EPollSelectorImpl)
496          at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
497          at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:332)
498          at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
499          at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
500          at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
501          at java.io.FilterInputStream.read(FilterInputStream.java:116)
502          at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:304)
503          at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
504          at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
505          - locked <0x00007f1808539178> (a java.io.BufferedInputStream)
506          at java.io.DataInputStream.readInt(DataInputStream.java:370)
507          at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:569)
508          at org.apache.hadoop.ipc.Client$Connection.run(Client.java:477)
509----
510
511And here is a master trying to recover a lease after a RegionServer died:
512
513[source]
514----
515"LeaseChecker" daemon prio=10 tid=0x00000000407ef800 nid=0x76cd waiting on condition [0x00007f6d0eae2000..0x00007f6d0eae2a70]
516--
517   java.lang.Thread.State: WAITING (on object monitor)
518          at java.lang.Object.wait(Native Method)
519          at java.lang.Object.wait(Object.java:485)
520          at org.apache.hadoop.ipc.Client.call(Client.java:726)
521          - locked <0x00007f6d1cd28f80> (a org.apache.hadoop.ipc.Client$Call)
522          at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
523          at $Proxy1.recoverBlock(Unknown Source)
524          at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2636)
525          at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:2832)
526          at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:529)
527          at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:186)
528          at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:530)
529          at org.apache.hadoop.hbase.util.FSUtils.recoverFileLease(FSUtils.java:619)
530          at org.apache.hadoop.hbase.regionserver.wal.HLog.splitLog(HLog.java:1322)
531          at org.apache.hadoop.hbase.regionserver.wal.HLog.splitLog(HLog.java:1210)
532          at org.apache.hadoop.hbase.master.HMaster.splitLogAfterStartup(HMaster.java:648)
533          at org.apache.hadoop.hbase.master.HMaster.joinCluster(HMaster.java:572)
534          at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:503)
535----
536
537[[trouble.tools.opentsdb]]
538==== OpenTSDB
539
540link:http://opentsdb.net[OpenTSDB] is an excellent alternative to Ganglia as it uses Apache HBase to store all the time series and doesn't have to downsample.
541Monitoring your own HBase cluster that hosts OpenTSDB is a good exercise.
542
543Here's an example of a cluster that's suffering from hundreds of compactions launched almost all around the same time, which severely affects the IO performance: (TODO: insert graph plotting compactionQueueSize)
544
545It's a good practice to build dashboards with all the important graphs per machine and per cluster so that debugging issues can be done with a single quick look.
546For example, at StumbleUpon there's one dashboard per cluster with the most important metrics from both the OS and Apache HBase.
547You can then go down at the machine level and get even more detailed metrics.
548
549[[trouble.tools.clustersshtop]]
550==== clusterssh+top
551
clusterssh+top is like a poor man's monitoring system, and it can be quite useful when you have only a few machines as it's very easy to set up.
Starting clusterssh will give you one terminal per machine plus another terminal in which whatever you type is retyped in every window.
This means that you can type `top` once and it will start it for all of your machines at the same time, giving you a full view of the current state of your cluster.
555You can also tail all the logs at the same time, edit files, etc.
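
For example (a sketch; the hostnames are illustrative and the executable name may vary by distribution):

[source,bourne]
----
# Open one synchronized terminal per RegionServer host.
cssh rs1.example.com rs2.example.com rs3.example.com
----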
556
557[[trouble.client]]
558== Client
559
560For more information on the HBase client, see <<architecture.client,client>>.
561
562=== Missed Scan Results Due To Mismatch Of `hbase.client.scanner.max.result.size` Between Client and Server
563If either the client or server version is lower than 0.98.11/1.0.0 and the server
564has a smaller value for `hbase.client.scanner.max.result.size` than the client, scan
565requests that reach the server's `hbase.client.scanner.max.result.size` are likely
566to miss data. In particular, 0.98.11 defaults `hbase.client.scanner.max.result.size`
567to 2 MB but other versions default to larger values. For this reason, be very careful
568using 0.98.11 servers with any other client version.
569
570[[trouble.client.scantimeout]]
571=== ScannerTimeoutException or UnknownScannerException
572
573This is thrown if the time between RPC calls from the client to RegionServer exceeds the scan timeout.
574For example, if `Scan.setCaching` is set to 500, then there will be an RPC call to fetch the next batch of rows every 500 `.next()` calls on the ResultScanner because data is being transferred in blocks of 500 rows to the client.
575Reducing the setCaching value may be an option, but setting this value too low makes for inefficient processing on numbers of rows.
576
577See <<perf.hbase.client.caching>>.
578
579=== Performance Differences in Thrift and Java APIs
580
581Poor performance, or even `ScannerTimeoutExceptions`, can occur if `Scan.setCaching` is too high, as discussed in <<trouble.client.scantimeout>>.
582If the Thrift client uses the wrong caching settings for a given workload, performance can suffer compared to the Java API.
583To set caching for a given scan in the Thrift client, use the `scannerGetList(scannerId, numRows)` method, where `numRows` is an integer representing the number of rows to cache.
584In one case, it was found that reducing the cache for Thrift scans from 1000 to 100 increased performance to near parity with the Java API given the same queries.
585
586See also Jesse Andersen's link:http://blog.cloudera.com/blog/2014/04/how-to-use-the-hbase-thrift-interface-part-3-using-scans/[blog post] about using Scans with Thrift.
587
588[[trouble.client.lease.exception]]
589=== `LeaseException` when calling `Scanner.next`
590
591In some situations clients that fetch data from a RegionServer get a LeaseException instead of the usual <<trouble.client.scantimeout>>.
592Usually the source of the exception is `org.apache.hadoop.hbase.regionserver.Leases.removeLease(Leases.java:230)` (line number may vary). It tends to happen in the context of a slow/freezing `RegionServer#next` call.
593It can be prevented by having `hbase.rpc.timeout` > `hbase.regionserver.lease.period`.
Harsh J investigated the issue as part of the mailing list thread link:http://mail-archives.apache.org/mod_mbox/hbase-user/201209.mbox/%3CCAOcnVr3R-LqtKhFsk8Bhrm-YW2i9O6J6Fhjz2h7q6_sxvwd2yw%40mail.gmail.com%3E[HBase, mail # user - Lease does not exist exceptions].
595
596[[trouble.client.scarylogs]]
597=== Shell or client application throws lots of scary exceptions during normal operation
598
Since 0.20.0 the default log level for `org.apache.hadoop.hbase.*` is DEBUG.
600
601On your clients, edit _$HBASE_HOME/conf/log4j.properties_ and change this: `log4j.logger.org.apache.hadoop.hbase=DEBUG` to this: `log4j.logger.org.apache.hadoop.hbase=INFO`, or even `log4j.logger.org.apache.hadoop.hbase=WARN`.
602
603[[trouble.client.longpauseswithcompression]]
604=== Long Client Pauses With Compression
605
606This is a fairly frequent question on the Apache HBase dist-list.
607The scenario is that a client is typically inserting a lot of data into a relatively un-optimized HBase cluster.
608Compression can exacerbate the pauses, although it is not the source of the problem.
609
610See <<precreate.regions>> on the pattern for pre-creating regions and confirm that the table isn't starting with a single region.
611
612See <<perf.configurations>> for cluster configuration, particularly `hbase.hstore.blockingStoreFiles`, `hbase.hregion.memstore.block.multiplier`, `MAX_FILESIZE` (region size), and `MEMSTORE_FLUSHSIZE.`
613
A slightly longer explanation of why pauses can happen is as follows: Puts are sometimes blocked on the MemStores, which are blocked by the flusher thread, which is blocked because there are too many files to compact, because the compactor is given too many small files to compact and has to compact the same data repeatedly.
615This situation can occur even with minor compactions.
616Compounding this situation, Apache HBase doesn't compress data in memory.
617Thus, the 64MB that lives in the MemStore could become a 6MB file after compression - which results in a smaller StoreFile.
618The upside is that more data is packed into the same region, but performance is achieved by being able to write larger files - which is why HBase waits until the flushsize before writing a new StoreFile.
619And smaller StoreFiles become targets for compaction.
620Without compression the files are much bigger and don't need as much compaction, however this is at the expense of I/O.
621
622For additional information, see this thread on link:http://search-hadoop.com/m/WUnLM6ojHm1/Long+client+pauses+with+compression&subj=Long+client+pauses+with+compression[Long client pauses with compression].
623
624[[trouble.client.security.rpc.krb]]
625=== Secure Client Connect ([Caused by GSSException: No valid credentials provided...])
626
627You may encounter the following error:
628
629----
630Secure Client Connect ([Caused by GSSException: No valid credentials provided
631        (Mechanism level: Request is a replay (34) V PROCESS_TGS)])
632----
633
634This issue is caused by bugs in the MIT Kerberos replay_cache component, link:http://krbdev.mit.edu/rt/Ticket/Display.html?id=1201[#1201] and link:http://krbdev.mit.edu/rt/Ticket/Display.html?id=5924[#5924].
635These bugs caused the old version of krb5-server to erroneously block subsequent requests sent from a Principal.
This caused krb5-server to block the connections sent from one client (one HTable instance with multi-threaded connection instances for each RegionServer); messages such as `Request is a replay (34)` are logged in the client log.
You can ignore the messages, because HTable will retry 5 * 10 (50) times for each failed connection by default.
HTable will throw an IOException if any connection to the RegionServer fails after the retries, so that the user client code for the HTable instance can handle it further.
638NOTE: `HTable` is deprecated in HBase 1.0, in favor of `Table`.
639
640Alternatively, update krb5-server to a version which solves these issues, such as krb5-server-1.10.3.
641See JIRA link:https://issues.apache.org/jira/browse/HBASE-10379[HBASE-10379] for more details.
642
643[[trouble.client.zookeeper]]
644=== ZooKeeper Client Connection Errors
645
646Errors like this...
647
648[source]
649----
650
65111/07/05 11:26:41 WARN zookeeper.ClientCnxn: Session 0x0 for server null,
652 unexpected error, closing socket connection and attempting reconnect
653 java.net.ConnectException: Connection refused: no further information
654        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
655        at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
656        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1078)
657 11/07/05 11:26:43 INFO zookeeper.ClientCnxn: Opening socket connection to
658 server localhost/127.0.0.1:2181
659 11/07/05 11:26:44 WARN zookeeper.ClientCnxn: Session 0x0 for server null,
660 unexpected error, closing socket connection and attempting reconnect
661 java.net.ConnectException: Connection refused: no further information
662        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
663        at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
664        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1078)
665 11/07/05 11:26:45 INFO zookeeper.ClientCnxn: Opening socket connection to
666 server localhost/127.0.0.1:2181
667----
668
669...are either due to ZooKeeper being down, or unreachable due to network issues.
670
671The utility <<trouble.tools.builtin.zkcli>> may help investigate ZooKeeper issues.
672
673[[trouble.client.oome.directmemory.leak]]
674=== Client running out of memory though heap size seems to be stable (but the off-heap/direct heap keeps growing)
675
676You are likely running into the issue that is described and worked through in the mail thread link:http://search-hadoop.com/m/ubhrX8KvcH/Suspected+memory+leak&subj=Re+Suspected+memory+leak[HBase, mail # user - Suspected memory leak] and continued over in link:http://search-hadoop.com/m/p2Agc1Zy7Va/MaxDirectMemorySize+Was%253A+Suspected+memory+leak&subj=Re+FeedbackRe+Suspected+memory+leak[HBase, mail # dev - FeedbackRe: Suspected memory leak].
677A workaround is passing your client-side JVM a reasonable value for `-XX:MaxDirectMemorySize`.
678By default, the `MaxDirectMemorySize` is equal to your `-Xmx` max heapsize setting (if `-Xmx` is set). Try setting it to something smaller (for example, one user had success setting it to `1g` when they had a client-side heap of `12g`). If you set it too small, it will bring on `FullGCs` so keep it a bit hefty.
You want to make this setting client-side only, especially if you are running the new experimental server-side off-heap cache, since this feature depends on being able to use big direct buffers (you may have to keep separate client-side and server-side config dirs).
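
For example, in the client's _hbase-env.sh_ (a sketch; the `1g` value is only illustrative, and it assumes `HBASE_OPTS` is what your client processes pick up):

[source,bourne]
----
# Cap direct (off-heap) memory on the client side only.
export HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize=1g"
----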
680
681[[trouble.client.slowdown.admin]]
682=== Client Slowdown When Calling Admin Methods (flush, compact, etc.)
683
684This is a client issue fixed by link:https://issues.apache.org/jira/browse/HBASE-5073[HBASE-5073] in 0.90.6.
685There was a ZooKeeper leak in the client and the client was getting pummeled by ZooKeeper events with each additional invocation of the admin API.
686
687[[trouble.client.security.rpc]]
688=== Secure Client Cannot Connect ([Caused by GSSException: No valid credentials provided(Mechanism level: Failed to find any Kerberos tgt)])
689
690There can be several causes that produce this symptom.
691
692First, check that you have a valid Kerberos ticket.
693One is required in order to set up communication with a secure Apache HBase cluster.
694Examine the ticket currently in the credential cache, if any, by running the `klist` command line utility.
695If no ticket is listed, you must obtain a ticket by running the `kinit` command with either a keytab specified, or by interactively entering a password for the desired principal.
696
697Then, consult the link:http://docs.oracle.com/javase/1.5.0/docs/guide/security/jgss/tutorials/Troubleshooting.html[Java Security Guide troubleshooting section].
698The most common problem addressed there is resolved by setting `javax.security.auth.useSubjectCredsOnly` system property value to `false`.
699
700Because of a change in the format in which MIT Kerberos writes its credentials cache, there is a bug in the Oracle JDK 6 Update 26 and earlier that causes Java to be unable to read the Kerberos credentials cache created by versions of MIT Kerberos 1.8.1 or higher.
701If you have this problematic combination of components in your environment, to work around this problem, first log in with `kinit` and then immediately refresh the credential cache with `kinit -R`.
702The refresh will rewrite the credential cache without the problematic formatting.
703
704Finally, depending on your Kerberos configuration, you may need to install the link:http://docs.oracle.com/javase/1.4.2/docs/guide/security/jce/JCERefGuide.html[Java Cryptography Extension], or JCE.
Ensure the JCE jars are on the classpath on both server and client systems.
706
707You may also need to download the link:http://www.oracle.com/technetwork/java/javase/downloads/jce-6-download-429243.html[unlimited strength JCE policy files].
708Uncompress and extract the downloaded file, and install the policy jars into _<java-home>/lib/security_.
709
710[[trouble.mapreduce]]
711== MapReduce
712
713[[trouble.mapreduce.local]]
714=== You Think You're On The Cluster, But You're Actually Local
715
The following stacktrace happened using `ImportTsv`, but things like this can happen on any job with a misconfiguration.
717
718[source,text]
719----
720    WARN mapred.LocalJobRunner: job_local_0001
721java.lang.IllegalArgumentException: Can't read partitions file
722       at org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:111)
723       at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
724       at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
725       at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:560)
726       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:639)
727       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
728       at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:210)
729Caused by: java.io.FileNotFoundException: File _partition.lst does not exist.
730       at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:383)
731       at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
732       at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:776)
733       at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
734       at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
735       at org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.readPartitions(TotalOrderPartitioner.java:296)
736----
737
738...see the critical portion of the stack? It's...
739
740[source]
741----
742at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:210)
743----
744
745LocalJobRunner means the job is running locally, not on the cluster.
746
747To solve this problem, you should run your MR job with your `HADOOP_CLASSPATH` set to include the HBase dependencies.
748The "hbase classpath" utility can be used to do this easily.
749For example (substitute VERSION with your HBase version):
750
751[source,bourne]
752----
753HADOOP_CLASSPATH=`hbase classpath` hadoop jar $HBASE_HOME/hbase-server-VERSION.jar rowcounter usertable
754----
755
See http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/package-summary.html#classpath for more information on HBase MapReduce jobs and classpaths.
757
758[[trouble.hbasezerocopybytestring]]
759=== Launching a job, you get java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString or class com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass com.google.protobuf.LiteralByteString
760
761See link:https://issues.apache.org/jira/browse/HBASE-10304[HBASE-10304 Running an hbase job jar: IllegalAccessError: class com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass com.google.protobuf.LiteralByteString] and link:https://issues.apache.org/jira/browse/HBASE-11118[HBASE-11118 non environment variable solution for "IllegalAccessError: class com.google.protobuf.ZeroCopyLiteralByteString cannot access its superclass com.google.protobuf.LiteralByteString"].
762The issue can also show up when trying to run spark jobs.
763See link:https://issues.apache.org/jira/browse/HBASE-10877[HBASE-10877 HBase non-retriable exception list should be expanded].
764
765[[trouble.namenode]]
766== NameNode
767
768For more information on the NameNode, see <<arch.hdfs>>.
769
770[[trouble.namenode.disk]]
771=== HDFS Utilization of Tables and Regions
772
To determine how much space HBase is using on HDFS, use the `hadoop` shell commands from the NameNode.
774For example...
775
776
777[source,bourne]
778----
779hadoop fs -dus /hbase/
780----
781...returns the summarized disk utilization for all HBase objects.
782
783
784[source,bourne]
785----
786hadoop fs -dus /hbase/myTable
787----
788...returns the summarized disk utilization for the HBase table 'myTable'.
789
790
791[source,bourne]
792----
793hadoop fs -du /hbase/myTable
794----
795...returns a list of the regions under the HBase table 'myTable' and their disk utilization.
796
797For more information on HDFS shell commands, see the link:http://hadoop.apache.org/common/docs/current/file_system_shell.html[HDFS FileSystem Shell documentation].
798
799[[trouble.namenode.hbase.objects]]
800=== Browsing HDFS for HBase Objects
801
802Sometimes it will be necessary to explore the HBase objects that exist on HDFS.
803These objects could include the WALs (Write Ahead Logs), tables, regions, StoreFiles, etc.
804The easiest way to do this is with the NameNode web application that runs on port 50070.
The NameNode web application will provide links to all the DataNodes in the cluster so that they can be browsed seamlessly.
806
807The HDFS directory structure of HBase tables in the cluster is...
808[source]
809----
810
811/hbase
812    /<Table>                    (Tables in the cluster)
813        /<Region>               (Regions for the table)
814            /<ColumnFamily>     (ColumnFamilies for the Region for the table)
815                /<StoreFile>    (StoreFiles for the ColumnFamily for the Regions for the table)
816----
817
The HDFS directory structure of HBase WALs is...
819[source]
820----
821
822/hbase
823    /.logs
824        /<RegionServer>    (RegionServers)
825            /<WAL>         (WAL files for the RegionServer)
826----
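
The same paths can also be listed from the command line (a sketch; assumes the default HBase root directory of `/hbase`):

[source,bourne]
----
# Browse table and WAL directories without the web UI.
hadoop fs -ls /hbase
hadoop fs -ls /hbase/.logs
----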
827
828See the link:http://hadoop.apache.org/common/docs/current/hdfs_user_guide.html[HDFS User Guide] for other non-shell diagnostic utilities like `fsck`.
829
830[[trouble.namenode.0size.hlogs]]
831==== Zero size WALs with data in them
832
833Problem: when getting a listing of all the files in a RegionServer's _.logs_ directory, one file has a size of 0 but it contains data.
834
835Answer: It's an HDFS quirk.
A file that's currently being written to will appear to have a size of 0, but once it's closed it will show its true size.
837
838[[trouble.namenode.uncompaction]]
839==== Use Cases
840
Two common use cases for querying HDFS for HBase objects are researching the degree of uncompaction of a table and checking whether a table has too many ColumnFamilies.
If there are a large number of StoreFiles for each ColumnFamily, it could indicate the need for a major compaction.
Additionally, if the resulting StoreFile after a major compaction is "small", it could indicate the need for a reduction of ColumnFamilies for the table.
844
845=== Unexpected Filesystem Growth
846
847If you see an unexpected spike in filesystem usage by HBase, two possible culprits
848are snapshots and WALs.
849
850Snapshots::
851  When you create a snapshot, HBase retains everything it needs to recreate the table's
852  state at that time of the snapshot. This includes deleted cells or expired versions.
853  For this reason, your snapshot usage pattern should be well-planned, and you should
854  prune snapshots that you no longer need. Snapshots are stored in `/hbase/.snapshots`,
855  and archives needed to restore snapshots are stored in
856  `/hbase/.archive/<_tablename_>/<_region_>/<_column_family_>/`.
857
858  *Do not* manage snapshots or archives manually via HDFS. HBase provides APIs and
  HBase Shell commands for managing them. For more information, see <<ops.snapshots>> and the shell sketch after this list.
860
861WAL::
862  Write-ahead logs (WALs) are stored in subdirectories of `/hbase/.logs/`, depending
863  on their status. Already-processed WALs are stored in `/hbase/.logs/oldWALs/` and
864  corrupt WALs are stored in `/hbase/.logs/.corrupt/` for examination.
865  If the size of any subdirectory of `/hbase/.logs/` is growing, examine the HBase
866  server logs to find the root cause for why WALs are not being processed correctly.
867
868*Do not* manage WALs manually via HDFS.
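
To review and prune snapshots, use the HBase shell rather than HDFS (a sketch; the snapshot name is hypothetical):

[source,bourne]
----
# List existing snapshots, then delete one that is no longer needed.
echo "list_snapshots" | ./hbase shell
echo "delete_snapshot 'usertable-snap-20150101'" | ./hbase shell
----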
869
870[[trouble.network]]
871== Network
872
873[[trouble.network.spikes]]
874=== Network Spikes
875
876If you are seeing periodic network spikes you might want to check the `compactionQueues` to see if major compactions are happening.
877
878See <<managed.compactions>> for more information on managing compactions.
879
880[[trouble.network.loopback]]
881=== Loopback IP
882
883HBase expects the loopback IP Address to be 127.0.0.1.
884See the Getting Started section on <<loopback.ip>>.
885
886[[trouble.network.ints]]
887=== Network Interfaces
888
889Are all the network interfaces functioning correctly? Are you sure? See the Troubleshooting Case Study in <<trouble.casestudy>>.
890
891[[trouble.rs]]
892== RegionServer
893
894For more information on the RegionServers, see <<regionserver.arch>>.
895
896[[trouble.rs.startup]]
897=== Startup Errors
898
899[[trouble.rs.startup.master_no_region]]
900==== Master Starts, But RegionServers Do Not
901
902The Master believes the RegionServers have the IP of 127.0.0.1 - which is localhost and resolves to the master's own localhost.
903
904The RegionServers are erroneously informing the Master that their IP addresses are 127.0.0.1.
905
906Modify _/etc/hosts_ on the region servers, from...
907
908[source]
909----
910# Do not remove the following line, or various programs
911# that require network functionality will fail.
912127.0.0.1               fully.qualified.regionservername regionservername  localhost.localdomain localhost
913::1             localhost6.localdomain6 localhost6
914----
915
916\... to (removing the master node's name from localhost)...
917
918[source]
919----
920# Do not remove the following line, or various programs
921# that require network functionality will fail.
922127.0.0.1               localhost.localdomain localhost
923::1             localhost6.localdomain6 localhost6
924----
925
926[[trouble.rs.startup.compression]]
927==== Compression Link Errors
928
Since compression algorithms such as LZO need to be installed and configured on each cluster, this is a frequent source of startup error.
930If you see messages like this...
931
932[source]
933----
934
93511/02/20 01:32:15 ERROR lzo.GPLNativeCodeLoader: Could not load native gpl library
936java.lang.UnsatisfiedLinkError: no gplcompression in java.library.path
937        at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1734)
938        at java.lang.Runtime.loadLibrary0(Runtime.java:823)
939        at java.lang.System.loadLibrary(System.java:1028)
940----
941
942\... then there is a path issue with the compression libraries.
943See the Configuration section on link:[LZO compression configuration].
944
945[[trouble.rs.runtime]]
946=== Runtime Errors
947
948[[trouble.rs.runtime.hang]]
949==== RegionServer Hanging
950
951Are you running an old JVM (< 1.6.0_u21?)? When you look at a thread dump, does it look like threads are BLOCKED but no one holds the lock all are blocked on? See link:https://issues.apache.org/jira/browse/HBASE-3622[HBASE 3622 Deadlock in
952            HBaseServer (JVM bug?)].
953Adding `-XX:+UseMembar` to the HBase `HBASE_OPTS` in _conf/hbase-env.sh_ may fix it.
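
For example (a sketch):

[source,bourne]
----
# In conf/hbase-env.sh
export HBASE_OPTS="$HBASE_OPTS -XX:+UseMembar"
----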
954
955[[trouble.rs.runtime.filehandles]]
956==== java.io.IOException...(Too many open files)
957
958If you see log messages like this...
959
960[source]
961----
962
9632010-09-13 01:24:17,336 WARN org.apache.hadoop.hdfs.server.datanode.DataNode:
964Disk-related IOException in BlockReceiver constructor. Cause is java.io.IOException: Too many open files
965        at java.io.UnixFileSystem.createFileExclusively(Native Method)
966        at java.io.File.createNewFile(File.java:883)
967----
968
969\... see the Getting Started section on link:[ulimit and nproc configuration].
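
A couple of quick checks, run as the user that launches the HBase and HDFS daemons (a sketch; the pid is whatever `jps` reports for the daemon in question):

[source,bourne]
----
ulimit -n              # open-file limit that new processes of this user will get
lsof -p <pid> | wc -l  # rough count of descriptors a running daemon currently holds
----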
970
971[[trouble.rs.runtime.xceivers]]
972==== xceiverCount 258 exceeds the limit of concurrent xcievers 256
973
974This typically shows up in the DataNode logs.
975
976See the Getting Started section on link:[xceivers configuration].
977
978[[trouble.rs.runtime.oom_nt]]
==== System instability, and the presence of "java.lang.OutOfMemoryError: unable to create new native thread" exceptions in HDFS DataNode logs or those of any system daemon
980
981See the Getting Started section on ulimit and nproc configuration.
982The default on recent Linux distributions is 1024 - which is far too low for HBase.
983
984[[trouble.rs.runtime.gc]]
985==== DFS instability and/or RegionServer lease timeouts
986
987If you see warning messages like this...
988
989[source]
990----
991
9922009-02-24 10:01:33,516 WARN org.apache.hadoop.hbase.util.Sleeper: We slept xxx ms, ten times longer than scheduled: 10000
9932009-02-24 10:01:33,516 WARN org.apache.hadoop.hbase.util.Sleeper: We slept xxx ms, ten times longer than scheduled: 15000
9942009-02-24 10:01:36,472 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: unable to report to master for xxx milliseconds - retrying
995----
996
\... or see full GC compactions, then you may be experiencing full GCs.
998
999[[trouble.rs.runtime.nolivenodes]]
1000==== "No live nodes contain current block" and/or YouAreDeadException
1001
1002These errors can happen either when running out of OS file handles or in periods of severe network problems where the nodes are unreachable.
1003
1004See the Getting Started section on ulimit and nproc configuration and check your network.
1005
1006[[trouble.rs.runtime.zkexpired]]
1007==== ZooKeeper SessionExpired events
1008
You may see the Master or RegionServers shutting down with messages like the following in the logs:
1010
1011[source]
1012----
1013
1014WARN org.apache.zookeeper.ClientCnxn: Exception
1015closing session 0x278bd16a96000f to sun.nio.ch.SelectionKeyImpl@355811ec
1016java.io.IOException: TIMED OUT
1017       at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:906)
1018WARN org.apache.hadoop.hbase.util.Sleeper: We slept 79410ms, ten times longer than scheduled: 5000
1019INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server hostname/IP:PORT
1020INFO org.apache.zookeeper.ClientCnxn: Priming connection to java.nio.channels.SocketChannel[connected local=/IP:PORT remote=hostname/IP:PORT]
1021INFO org.apache.zookeeper.ClientCnxn: Server connection successful
1022WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x278bd16a96000d to sun.nio.ch.SelectionKeyImpl@3544d65e
1023java.io.IOException: Session Expired
1024       at org.apache.zookeeper.ClientCnxn$SendThread.readConnectResult(ClientCnxn.java:589)
1025       at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:709)
1026       at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:945)
1027ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: ZooKeeper session expired
1028----
1029
The JVM is doing a long-running garbage collection which pauses every thread (aka "stop the world"). Since the RegionServer's local ZooKeeper client cannot send heartbeats during the pause, the session times out.
1031By design, we shut down any node that isn't able to contact the ZooKeeper ensemble after getting a timeout so that it stops serving data that may already be assigned elsewhere.
1032
* Make sure you give the RegionServer plenty of RAM (in _hbase-env.sh_); the default of 1GB won't be able to sustain long-running imports.
* Make sure you don't swap; the JVM never behaves well under swapping.
1035* Make sure you are not CPU starving the RegionServer thread.
1036  For example, if you are running a MapReduce job using 6 CPU-intensive tasks on a machine with 4 cores, you are probably starving the RegionServer enough to create longer garbage collection pauses.
1037* Increase the ZooKeeper session timeout
1038
1039If you wish to increase the session timeout, add the following to your _hbase-site.xml_ to increase the timeout from the default of 60 seconds to 120 seconds.
1040
1041[source,xml]
1042----
1043<property>
1044  <name>zookeeper.session.timeout</name>
  <value>120000</value>
1046</property>
1047<property>
1048  <name>hbase.zookeeper.property.tickTime</name>
1049  <value>6000</value>
1050</property>
1051----
1052
1053Be aware that setting a higher timeout means that the regions served by a failed RegionServer will take at least that amount of time to be transferred to another RegionServer.
For a production system serving live requests, we would instead recommend setting it lower than 1 minute and over-provisioning your cluster in order to lower the memory load on each machine (hence having less garbage to collect per machine).
1055
1056If this is happening during an upload which only happens once (like initially loading all your data into HBase), consider bulk loading.
1057
1058See <<trouble.zookeeper.general>> for other general information about ZooKeeper troubleshooting.
1059
1060[[trouble.rs.runtime.notservingregion]]
1061==== NotServingRegionException
1062
1063This exception is "normal" when found in the RegionServer logs at DEBUG level.
1064This exception is returned back to the client and then the client goes back to `hbase:meta` to find the new location of the moved region.
1065
However, if the NotServingRegionException is logged at ERROR, then the client ran out of retries and something is probably wrong.
1067
1068[[trouble.rs.runtime.double_listed_regions]]
1069==== Regions listed by domain name, then IP
1070
1071Fix your DNS.
In versions of Apache HBase before 0.92.x, reverse DNS needs to give the same answer as the forward lookup.
1073See link:https://issues.apache.org/jira/browse/HBASE-3431[HBASE 3431 RegionServer is not using the name given it by the master; double entry in master listing of servers] for gory details.
1074
1075[[brand.new.compressor]]
==== Logs flooded with '2011-01-10 12:40:48,407 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor' messages
1077
This means HBase is not using the native versions of the compression libraries.
1079See link:https://issues.apache.org/jira/browse/HBASE-1900[HBASE-1900 Put back native support when hadoop 0.21 is released].
1080Copy the native libs from hadoop under HBase lib dir or symlink them into place and the message should go away.
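
A rough sketch of the copy, assuming conventional tarball layouts (adjust the paths to your actual Hadoop and HBase installations, and create the target directory if it does not exist):

----
$ cp -r $HADOOP_HOME/lib/native/* $HBASE_HOME/lib/native/
# or symlink the directory into place instead of copying
$ ln -s $HADOOP_HOME/lib/native $HBASE_HOME/lib/native
----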
1081
1082[[trouble.rs.runtime.client_went_away]]
1083==== Server handler X on 60020 caught: java.nio.channels.ClosedChannelException
1084
If you see this type of message it means that the RegionServer was trying to read from or send data to a client, but the client had already gone away.
1086Typical causes for this are if the client was killed (you see a storm of messages like this when a MapReduce job is killed or fails) or if the client receives a SocketTimeoutException.
1087It's harmless, but you should consider digging in a bit more if you aren't doing something to trigger them.
1088
1089=== Snapshot Errors Due to Reverse DNS
1090
1091Several operations within HBase, including snapshots, rely on properly configured reverse DNS.
1092Some environments, such as Amazon EC2, have trouble with reverse DNS.
1093If you see errors like the following on your RegionServers, check your reverse DNS configuration:
1094
1095----
1096
10972013-05-01 00:04:56,356 DEBUG org.apache.hadoop.hbase.procedure.Subprocedure: Subprocedure 'backup1'
1098coordinator notified of 'acquire', waiting on 'reached' or 'abort' from coordinator.
1099----
1100
1101In general, the hostname reported by the RegionServer needs to be the same as the hostname the Master is trying to reach.
1102You can see a hostname mismatch by looking for the following type of message in the RegionServer's logs at start-up.
1103
1104----
1105
11062013-05-01 00:03:00,614 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Master passed us hostname
1107to use. Was=myhost-1234, Now=ip-10-55-88-99.ec2.internal
1108----
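
A quick sanity check from the shell is to compare forward and reverse lookups for the node; the hostname and address below are placeholders, and the two lookups should agree:

----
$ host myhost-1234
$ host 10.55.88.99
----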
1109
1110[[trouble.rs.shutdown]]
1111=== Shutdown Errors
1112
1113
1114
1115[[trouble.master]]
1116== Master
1117
1118For more information on the Master, see <<architecture.master,master>>.
1119
1120[[trouble.master.startup]]
1121=== Startup Errors
1122
1123[[trouble.master.startup.migration]]
1124==== Master says that you need to run the HBase migrations script
1125
1126Upon running that, the HBase migrations script says no files in root directory.
1127
1128HBase expects the root directory to either not exist, or to have already been initialized by HBase running a previous time.
1129If you create a new directory for HBase using Hadoop DFS, this error will occur.
1130Make sure the HBase root directory does not currently exist or has been initialized by a previous run of HBase.
A sure-fire solution is to use Hadoop DFS to delete the HBase root and let HBase create and initialize the directory itself.
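
For example, assuming `hbase.rootdir` points at _/hbase_ (substitute your own root directory):

----
$ hadoop fs -rm -r /hbase
----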
1132
1133[[trouble.master.startup.zk.buffer]]
1134==== Packet len6080218 is out of range!
1135
If you have many regions on your cluster and you see an error like the one in this section's title in your logs, see link:https://issues.apache.org/jira/browse/HBASE-4246[HBASE-4246 Cluster with too many regions cannot withstand some master failover scenarios].
1137
1138[[trouble.master.shutdown]]
1139=== Shutdown Errors
1140
1141
1142
1143[[trouble.zookeeper]]
1144== ZooKeeper
1145
1146[[trouble.zookeeper.startup]]
1147=== Startup Errors
1148
1149[[trouble.zookeeper.startup.address]]
1150==== Could not find my address: xyz in list of ZooKeeper quorum servers
1151
A ZooKeeper server wasn't able to start and throws that error, where xyz is the name of your server.
1154
1155This is a name lookup problem.
1156HBase tries to start a ZooKeeper server on some machine but that machine isn't able to find itself in the `hbase.zookeeper.quorum` configuration.
1157
1158Use the hostname presented in the error message instead of the value you used.
1159If you have a DNS server, you can set `hbase.zookeeper.dns.interface` and `hbase.zookeeper.dns.nameserver` in _hbase-site.xml_ to make sure it resolves to the correct FQDN.
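
A sketch of those two properties in _hbase-site.xml_; the interface and nameserver values are examples and must match your environment:

[source,xml]
----
<property>
  <name>hbase.zookeeper.dns.interface</name>
  <value>eth0</value>
</property>
<property>
  <name>hbase.zookeeper.dns.nameserver</name>
  <value>ns1.example.com</value>
</property>
----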
1160
1161[[trouble.zookeeper.general]]
1162=== ZooKeeper, The Cluster Canary
1163
ZooKeeper is the cluster's "canary in the mineshaft". It will be the first to notice issues, if any, so making sure it's happy is the shortcut to a humming cluster.
1165
1166See the link:http://wiki.apache.org/hadoop/ZooKeeper/Troubleshooting[ZooKeeper Operating Environment Troubleshooting] page.
1167It has suggestions and tools for checking disk and networking performance; i.e.
1168the operating environment your ZooKeeper and HBase are running in.
1169
1170Additionally, the utility <<trouble.tools.builtin.zkcli>> may help investigate ZooKeeper issues.
1171
1172[[trouble.ec2]]
1173== Amazon EC2
1174
1175[[trouble.ec2.zookeeper]]
1176=== ZooKeeper does not seem to work on Amazon EC2
1177
1178HBase does not start when deployed as Amazon EC2 instances.
1179Exceptions like the below appear in the Master and/or RegionServer logs:
1180
1181[source]
1182----
1183
1184  2009-10-19 11:52:27,030 INFO org.apache.zookeeper.ClientCnxn: Attempting
1185  connection to server ec2-174-129-15-236.compute-1.amazonaws.com/10.244.9.171:2181
1186  2009-10-19 11:52:27,032 WARN org.apache.zookeeper.ClientCnxn: Exception
1187  closing session 0x0 to sun.nio.ch.SelectionKeyImpl@656dc861
1188  java.net.ConnectException: Connection refused
1189----
1190
1191Security group policy is blocking the ZooKeeper port on a public address.
1192Use the internal EC2 host names when configuring the ZooKeeper quorum peer list.
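
For example, in _hbase-site.xml_ (the internal host names below are placeholders):

[source,xml]
----
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>ip-10-244-9-171.ec2.internal,ip-10-244-9-172.ec2.internal,ip-10-244-9-173.ec2.internal</value>
</property>
----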
1193
1194[[trouble.ec2.instability]]
1195=== Instability on Amazon EC2
1196
1197Questions on HBase and Amazon EC2 come up frequently on the HBase dist-list.
Search for old threads using link:http://search-hadoop.com/[Search Hadoop].
1199
1200[[trouble.ec2.connection]]
1201=== Remote Java Connection into EC2 Cluster Not Working
1202
1203See Andrew's answer here, up on the user list: link:http://search-hadoop.com/m/sPdqNFAwyg2[Remote Java client connection into EC2 instance].
1204
1205[[trouble.versions]]
1206== HBase and Hadoop version issues
1207
1208[[trouble.versions.205]]
1209=== `NoClassDefFoundError` when trying to run 0.90.x on hadoop-0.20.205.x (or hadoop-1.0.x)
1210
1211Apache HBase 0.90.x does not ship with hadoop-0.20.205.x, etc.
1212To make it run, you need to replace the hadoop jars that Apache HBase shipped with in its _lib_ directory with those of the Hadoop you want to run HBase on.
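
A rough sketch of the swap, assuming standard tarball layouts and the hadoop-core jar naming used by that Hadoop line (adjust names and paths to your installation):

----
$ rm $HBASE_HOME/lib/hadoop-core-*.jar
$ cp $HADOOP_HOME/hadoop-core-*.jar $HBASE_HOME/lib/
----
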
1213If even after replacing Hadoop jars you get the below exception:
1214
1215[source]
1216----
1217
1218sv4r6s38: Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/configuration/Configuration
1219sv4r6s38:       at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<init>(DefaultMetricsSystem.java:37)
1220sv4r6s38:       at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.<clinit>(DefaultMetricsSystem.java:34)
1221sv4r6s38:       at org.apache.hadoop.security.UgiInstrumentation.create(UgiInstrumentation.java:51)
1222sv4r6s38:       at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:209)
1223sv4r6s38:       at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:177)
1224sv4r6s38:       at org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled(UserGroupInformation.java:229)
1225sv4r6s38:       at org.apache.hadoop.security.KerberosName.<clinit>(KerberosName.java:83)
1226sv4r6s38:       at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:202)
1227sv4r6s38:       at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:177)
1228----
1229
1230you need to copy under _hbase/lib_, the _commons-configuration-X.jar_ you find in your Hadoop's _lib_ directory.
1231That should fix the above complaint.
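
For example (paths are illustrative; use whatever version of the jar your Hadoop ships):

----
$ cp $HADOOP_HOME/lib/commons-configuration-*.jar $HBASE_HOME/lib/
----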
1232
1233[[trouble.wrong.version]]
1234=== ...cannot communicate with client version...
1235
If you see something like the following in your logs [computeroutput]+... 2012-09-24 10:20:52,168 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown. org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot communicate with client version 4 ...+ then you are probably trying to talk to a Hadoop 2.0.x cluster from an HBase that has a Hadoop 1.0.x client. Use an HBase built against Hadoop 2.0, or rebuild your HBase by passing the +-Dhadoop.profile=2.0+ attribute to Maven (see <<maven.build.hadoop>> for more).
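
A sketch of the rebuild command (adding `-DskipTests` here is just a convenience, not a requirement):

----
$ mvn clean install -DskipTests -Dhadoop.profile=2.0
----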
1240
1241== IPC Configuration Conflicts with Hadoop
1242
1243If the Hadoop configuration is loaded after the HBase configuration, and you have configured custom IPC settings in both HBase and Hadoop, the Hadoop values may overwrite the HBase values.
1244There is normally no need to change these settings for HBase, so this problem is an edge case.
1245However, link:https://issues.apache.org/jira/browse/HBASE-11492[HBASE-11492] renames these settings for HBase to remove the chance of a conflict.
1246Each of the setting names have been prefixed with `hbase.`, as shown in the following table.
1247No action is required related to these changes unless you are already experiencing a conflict.
1248
1249These changes were backported to HBase 0.98.x and apply to all newer versions.
1250
1251[cols="1,1", options="header"]
1252|===
1253| Pre-0.98.x
| 0.98.x and Newer
1255
1256| ipc.server.listen.queue.size
1257| hbase.ipc.server.listen.queue.size
1258
1259| ipc.server.max.callqueue.size
1260| hbase.ipc.server.max.callqueue.size
1261
1262| ipc.server.callqueue.handler.factor
1263| hbase.ipc.server.callqueue.handler.factor
1264
1265| ipc.server.callqueue.read.share
1266| hbase.ipc.server.callqueue.read.share
1267
1268| ipc.server.callqueue.type
1269| hbase.ipc.server.callqueue.type
1270
1271| ipc.server.queue.max.call.delay
1272| hbase.ipc.server.queue.max.call.delay
1273
1274| ipc.server.max.callqueue.length
1275| hbase.ipc.server.max.callqueue.length
1276
1277| ipc.server.read.threadpool.size
1278| hbase.ipc.server.read.threadpool.size
1279
1280| ipc.server.tcpkeepalive
1281| hbase.ipc.server.tcpkeepalive
1282
1283| ipc.server.tcpnodelay
1284| hbase.ipc.server.tcpnodelay
1285
1286| ipc.client.call.purge.timeout
1287| hbase.ipc.client.call.purge.timeout
1288
1289| ipc.client.connection.maxidletime
1290| hbase.ipc.client.connection.maxidletime
1291
1292| ipc.client.idlethreshold
1293| hbase.ipc.client.idlethreshold
1294
1295| ipc.client.kill.max
1296| hbase.ipc.client.kill.max
1297
1298| ipc.server.scan.vtime.weight
1299| hbase.ipc.server.scan.vtime.weight
1300|===
1301
1302== HBase and HDFS
1303
1304General configuration guidance for Apache HDFS is out of the scope of this guide.
1305Refer to the documentation available at http://hadoop.apache.org/ for extensive information about configuring HDFS.
1306This section deals with HDFS in terms of HBase.
1307
1308In most cases, HBase stores its data in Apache HDFS.
1309This includes the HFiles containing the data, as well as the write-ahead logs (WALs) which store data before it is written to the HFiles and protect against RegionServer crashes.
1310HDFS provides reliability and protection to data in HBase because it is distributed.
1311To operate with the most efficiency, HBase needs data to be available locally.
1312Therefore, it is a good practice to run an HDFS DataNode on each RegionServer.
1313
1314.Important Information and Guidelines for HBase and HDFS
1315
1316HBase is a client of HDFS.::
1317  HBase is an HDFS client, using the HDFS `DFSClient` class, and references to this class appear in HBase logs with other HDFS client log messages.
1318
1319Configuration is necessary in multiple places.::
1320  Some HDFS configurations relating to HBase need to be done at the HDFS (server) side.
1321  Others must be done within HBase (at the client side). Other settings need to be set at both the server and client side.
1322
1323Write errors which affect HBase may be logged in the HDFS logs rather than HBase logs.::
1324  When writing, HDFS pipelines communications from one DataNode to another.
1325  HBase communicates to both the HDFS NameNode and DataNode, using the HDFS client classes.
1326  Communication problems between DataNodes are logged in the HDFS logs, not the HBase logs.
1327
1328HBase communicates with HDFS using two different ports.::
1329  HBase communicates with DataNodes using the `ipc.Client` interface and the `DataNode` class.
1330  References to these will appear in HBase logs.
  Each of these communication channels uses a different port (50010 and 50020 by default). The ports are configured in the HDFS configuration, via the `dfs.datanode.address` and `dfs.datanode.ipc.address` parameters.
1332
1333Errors may be logged in HBase, HDFS, or both.::
1334  When troubleshooting HDFS issues in HBase, check logs in both places for errors.
1335
1336HDFS takes a while to mark a node as dead. You can configure HDFS to avoid using stale DataNodes.::
1337  By default, HDFS does not mark a node as dead until it is unreachable for 630 seconds.
1338  In Hadoop 1.1 and Hadoop 2.x, this can be alleviated by enabling checks for stale DataNodes, though this check is disabled by default.
  You can enable the check for reads and writes separately, via the `dfs.namenode.avoid.read.stale.datanode` and `dfs.namenode.avoid.write.stale.datanode` settings (see the sketch after this list).
1340  A stale DataNode is one that has not been reachable for `dfs.namenode.stale.datanode.interval` (default is 30 seconds). Stale datanodes are avoided, and marked as the last possible target for a read or write operation.
1341  For configuration details, see the HDFS documentation.
1342
1343Settings for HDFS retries and timeouts are important to HBase.::
1344  You can configure settings for various retries and timeouts.
1345  Always refer to the HDFS documentation for current recommendations and defaults.
1346  Some of the settings important to HBase are listed here.
1347  Defaults are current as of Hadoop 2.3.
1348  Check the Hadoop documentation for the most current values and recommendations.
1349
1350The HBase Balancer and HDFS Balancer are incompatible::
1351  The HDFS balancer attempts to spread HDFS blocks evenly among DataNodes. HBase relies
1352  on compactions to restore locality after a region split or failure. These two types
1353  of balancing do not work well together.
1354+
1355In the past, the generally accepted advice was to turn off the HDFS load balancer and rely
1356on the HBase balancer, since the HDFS balancer would degrade locality. This advice
1357is still valid if your HDFS version is lower than 2.7.1.
1358+
1359link:https://issues.apache.org/jira/browse/HDFS-6133[HDFS-6133] provides the ability
1360to exclude a given directory from the HDFS load balancer, by setting the
1361`dfs.datanode.block-pinning.enabled` property to `true` in your HDFS
1362configuration and running the following hdfs command:
1363+
1364----
1365$ sudo -u hdfs hdfs balancer -exclude /hbase
1366----
1367+
1368NOTE: HDFS-6133 is available in HDFS 2.7.0 and higher, but HBase does not support
1369running on HDFS 2.7.0, so you must be using HDFS 2.7.1 or higher to use this feature
1370with HBase.
1371
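Returning to the stale DataNode checks mentioned in the list above, a minimal _hdfs-site.xml_ sketch enables them for both reads and writes; the interval shown is the default mentioned above and need not be set explicitly:

[source,xml]
----
<property>
  <name>dfs.namenode.avoid.read.stale.datanode</name>
  <value>true</value>
</property>
<property>
  <name>dfs.namenode.avoid.write.stale.datanode</name>
  <value>true</value>
</property>
<property>
  <name>dfs.namenode.stale.datanode.interval</name>
  <value>30000</value>
</property>
----
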
1372.Connection Timeouts
Connection timeouts occur between the client (HBase) and the HDFS DataNode.
They may occur when establishing a connection, attempting to read, or attempting to write.
The two settings below are used in combination, and affect connections between the `DFSClient` and the DataNode, the `ipc.Client` and the DataNode, and communication between two DataNodes.
1376
1377`dfs.client.socket-timeout` (default: 60000)::
1378  The amount of time before a client connection times out when establishing a connection or reading.
1379  The value is expressed in milliseconds, so the default is 60 seconds.
1380
1381`dfs.datanode.socket.write.timeout` (default: 480000)::
1382  The amount of time before a write operation times out.
1383  The default is 8 minutes, expressed as milliseconds.
1384
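Expressed as _hdfs-site.xml_ properties, and keeping the defaults quoted above (only override these after consulting the HDFS documentation for your version):

[source,xml]
----
<property>
  <name>dfs.client.socket-timeout</name>
  <value>60000</value>
</property>
<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>480000</value>
</property>
----
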
1385.Typical Error Logs
1386The following types of errors are often seen in the logs.
1387
1388`INFO HDFS.DFSClient: Failed to connect to /xxx50010, add to deadNodes and
1389            continue java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel
1390            to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending
1391            remote=/region-server-1:50010]`::
1392  All DataNodes for a block are dead, and recovery is not possible.
1393  Here is the sequence of events that leads to this error:
1394
1395`INFO org.apache.hadoop.HDFS.DFSClient: Exception in createBlockOutputStream
1396            java.net.SocketTimeoutException: 69000 millis timeout while waiting for channel to be
1397            ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/
1398            xxx:50010]`::
1399  This type of error indicates a write issue.
1400  In this case, the master wants to split the log.
  It does not have a local DataNode so it tries to connect to a remote DataNode, but the DataNode is dead.
1402
1403[[trouble.tests]]
1404== Running unit or integration tests
1405
1406[[trouble.hdfs_2556]]
1407=== Runtime exceptions from MiniDFSCluster when running tests
1408
1409If you see something like the following
1410
1411[source]
1412----
1413...
1414java.lang.NullPointerException: null
1415at org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes
1416at org.apache.hadoop.hdfs.MiniDFSCluster.<init>
1417at org.apache.hadoop.hbase.MiniHBaseCluster.<init>
1418at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniDFSCluster
1419at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster
1420...
1421----
1422
1423or
1424
1425[source]
1426----
1427...
1428java.io.IOException: Shutting down
1429at org.apache.hadoop.hbase.MiniHBaseCluster.init
1430at org.apache.hadoop.hbase.MiniHBaseCluster.<init>
1431at org.apache.hadoop.hbase.MiniHBaseCluster.<init>
1432at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster
1433at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster
1434...
1435----
1436
1437\... then try issuing the command +umask 022+ before launching tests.
This is a workaround for link:https://issues.apache.org/jira/browse/HDFS-2556[HDFS-2556].
1439
1440[[trouble.casestudy]]
1441== Case Studies
1442
1443For Performance and Troubleshooting Case Studies, see <<casestudies>>.
1444
1445[[trouble.crypto]]
1446== Cryptographic Features
1447
1448[[trouble.crypto.hbase_10132]]
1449=== sun.security.pkcs11.wrapper.PKCS11Exception: CKR_ARGUMENTS_BAD
1450
1451This problem manifests as exceptions ultimately caused by:
1452
1453[source]
1454----
1455Caused by: sun.security.pkcs11.wrapper.PKCS11Exception: CKR_ARGUMENTS_BAD
1456  at sun.security.pkcs11.wrapper.PKCS11.C_DecryptUpdate(Native Method)
1457  at sun.security.pkcs11.P11Cipher.implDoFinal(P11Cipher.java:795)
1458----
1459
1460This problem appears to affect some versions of OpenJDK 7 shipped by some Linux vendors.
1461NSS is configured as the default provider.
If the host has an x86_64 architecture, depending on whether the vendor packages contain the defect, the NSS provider will not function correctly.
1463
1464To work around this problem, find the JRE home directory and edit the file _lib/security/java.security_.
1465Edit the file to comment out the line:
1466
1467[source]
1468----
1469security.provider.1=sun.security.pkcs11.SunPKCS11 ${java.home}/lib/security/nss.cfg
1470----
1471
1472Then renumber the remaining providers accordingly.
1473
1474== Operating System Specific Issues
1475
1476=== Page Allocation Failure
1477
1478NOTE: This issue is known to affect CentOS 6.2 and possibly CentOS 6.5.
1479It may also affect some versions of Red Hat Enterprise Linux, according to https://bugzilla.redhat.com/show_bug.cgi?id=770545.
1480
1481Some users have reported seeing the following error:
1482
1483----
1484kernel: java: page allocation failure. order:4, mode:0x20
1485----
1486
1487Raising the value of `min_free_kbytes` was reported to fix this problem.
This parameter's default is derived from the amount of RAM on your system, and it is described in more detail at http://www.centos.org/docs/5/html/5.1/Deployment_Guide/s3-proc-sys-vm.html.
1489
1490To find the current value on your system, run the following command:
1491
1492----
1493[user@host]# cat /proc/sys/vm/min_free_kbytes
1494----
1495
1496Next, raise the value.
1497Try doubling, then quadrupling the value.
1498Note that setting the value too low or too high could have detrimental effects on your system.
1499Consult your operating system vendor for specific recommendations.
1500
1501Use the following command to modify the value of `min_free_kbytes`, substituting _<value>_ with your intended value:
1502
1503----
1504[user@host]# echo <value> > /proc/sys/vm/min_free_kbytes
1505----
1506
1507== JDK Issues
1508
1509=== NoSuchMethodError: java.util.concurrent.ConcurrentHashMap.keySet
1510
1511If you see this in your logs:
1512[source]
1513----
1514Caused by: java.lang.NoSuchMethodError: java.util.concurrent.ConcurrentHashMap.keySet()Ljava/util/concurrent/ConcurrentHashMap$KeySetView;
1515  at org.apache.hadoop.hbase.master.ServerManager.findServerWithSameHostnamePortWithLock(ServerManager.java:393)
1516  at org.apache.hadoop.hbase.master.ServerManager.checkAndRecordNewServer(ServerManager.java:307)
1517  at org.apache.hadoop.hbase.master.ServerManager.regionServerStartup(ServerManager.java:244)
1518  at org.apache.hadoop.hbase.master.MasterRpcServices.regionServerStartup(MasterRpcServices.java:304)
1519  at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:7910)
1520  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2020)
1521  ... 4 more
1522----
1523then check if you compiled with jdk8 and tried to run it on jdk7.
1524If so, this won't work.
1525Run on jdk8 or recompile with jdk7.
1526See link:https://issues.apache.org/jira/browse/HBASE-10607[HBASE-10607 JDK8 NoSuchMethodError involving ConcurrentHashMap.keySet if running on JRE 7].
1527