[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:java.compiler=
2013-09-29 11:21:03,056 [myid:2] - INFO [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:os.name=Linux
2013-09-29 11:21:03,057 [myid:2] - INFO [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:os.arch=amd64
2013-09-29 11:21:03,057 [myid:2] - INFO [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:os.version=2.6.32-358.el6.x86_64
2013-09-29 11:21:03,057 [myid:2] - INFO [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:user.name=demo
2013-09-29 11:21:03,057 [myid:2] - INFO [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:user.home=/home/demo
2013-09-29 11:21:03,058 [myid:2] - INFO [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:user.dir=/home/demo
2013-09-29 11:21:03,059 [myid:2] - INFO [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@162] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 datadir /var/zookeeper/version-2 snapdir /var/zookeeper/version-2
2013-09-29 11:21:03,059 [myid:2] - INFO [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Follower@63] - FOLLOWING - LEADER ELECTION TOOK - 233
2013-09-29 11:21:03,066 [myid:2] - INFO [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Learner@325] - Getting a snapshot from leader
2013-09-29 11:21:03,073 [myid:2] - INFO [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FileTxnSnapLog@270] - Snapshotting: 0x200000001 to /var/zookeeper/version-2/snapshot.200000001
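If the steps above do not restore the service within 30 minutes, start the cluster primary/standby switchover described in the emergency plan. For the detailed procedure, see the HDQS-AM-004 Historical Data Query System Emergency Handling Manual (《HDQS-AM-004历史数据查询系统应急处理手册》).
2.4.4. DataNode service failure
Symptom: the syslog service raises an alarm, and the alarm message is uploaded with the syslog log.
The following entry reports that the DataNode service on server TY103-006 was detected as dead.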
2013-10-20 10:05:02 INFO ] [com.cms.web.syslog.SyslogUtil:100] - 发送:
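CEB-HDQS|+|CEB-HDQS|+|1001|+|1001|+|NA|+|TY103-006|+|检测Datanode服务状态|+|TY103-006|+|dead|+|APP|+|HDQS|+|Datanode|+|1|+|TY103-006上Datanode服务故障|+|1382234640|+|xiaoxu|+|13810466464
(In this log entry, 发送 means "sent", 检测Datanode服务状态 means "check DataNode service status", and TY103-006上Datanode服务故障 means "DataNode service failure on TY103-006".)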
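The alert is a single record whose fields are separated by "|+|", so it can be split mechanically when triaging many alerts. The following is a minimal sketch in Python. The key names and field positions are guesses inferred from this one sample, not a documented schema, and the two trailing contact fields are left out of the sample string.

```python
from datetime import datetime, timezone

# Sample alert copied from the entry above (trailing contact fields omitted).
ALERT = (
    "CEB-HDQS|+|CEB-HDQS|+|1001|+|1001|+|NA|+|TY103-006|+|检测Datanode服务状态"
    "|+|TY103-006|+|dead|+|APP|+|HDQS|+|Datanode|+|1"
    "|+|TY103-006上Datanode服务故障|+|1382234640"
)


def parse_alert(message):
    """Split a '|+|'-delimited alert into a dict.

    Key names and positions are assumptions based on the sample only.
    """
    fields = message.split("|+|")
    return {
        "host": fields[5],
        "check": fields[6],
        "status": fields[8],
        "service": fields[11],
        "description": fields[13],
        # The numeric field looks like a Unix timestamp in seconds.
        "raised_at": datetime.fromtimestamp(int(fields[14]), tz=timezone.utc),
    }


alert = parse_alert(ALERT)
print(alert["host"], alert["service"], alert["status"], alert["raised_at"].isoformat())
```

Read as a Unix timestamp, 1382234640 is 2013-10-20 02:04:00 UTC. If the servers run on UTC+8, that is 10:04:00 local time, about one minute before the 10:05:02 log entry that sent the alert.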
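Confirmation: open the Hadoop web UI at http://10.1.242.182:50070 and check the node status. Click Dead nodes to list the nodes that are down (the entry marked in red).
Check the log on the server whose DataNode service failed: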
2013-10-20 09:42:56,741 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: Verification succeeded for BP-1800471205-192.168.0.75-1376982711635:blk_2801755526513394545_957410
2013-10-20 09:44:01,341 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: Verification succeeded for BP-1800471205-192.168.0.75-1376982711635:blk_8004390811280095539_189478
2013-10-20 09:45:05,940 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: Verification succeeded for BP-1800471205-192.168.0.75-1376982711635:blk_-7558433326837136207_950828
2013-10-20 09:45:05,964 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: Verification succeeded for BP-1800471205-192.168.0.75-1376982711635:blk_961832263082316047_963685
2013-10-20 09:45:06,141 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: Verification succeeded for BP-1800471205-192.168.0.75-1376982711635:blk_3084063903778417422_751015
2013-10-20 09:45:06,159 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: Verification succeeded for BP-1800471205-192.168.0.75-1376982711635:blk_-3145692262741932109_458312
2013-10-20 09:45:19,741 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: Verification succeeded for BP-1800471205-192.168.0.75-1376982711635:blk_-4141702440184562434_893773
2013-10-20 09:45:19,776 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: Verification succeeded for BP-1800471205-192.168.0.75-1376982711635:blk_-3883491030799322785_505099
2013-10-20 09:46:24,341 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: Verification succeeded for BP-1800471205-192.168.0.75-1376982711635:blk_-5675967604119992255_696596
2013-10-20 09:47:28,941 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: Verification succeeded for BP-1800471205-192.168.0.75-1376982711635:blk_5659778062198523850_155467
2013-10-20 09:47:31,541 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: Verification succeeded for BP-1800471205-192.168.0.75-1376982711635:blk_-2603884964565251091_104376
2013-10-20 09:47:31,741 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: Verification succeeded for BP-1800471205-192.168.0.75-1376982711635:blk_-3260698574769839868_829758
2013-10-20 09:48:36,341 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: Verification succeeded for BP-1800471205-192.168.0.75-1376982711635:blk_-7521957833443533987_935827
2013-10-20 09:48:36,366 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: Verification succeeded for BP-1800471205-192.168.0.75-1376982711635:blk_-8573477859044035824_918684
2013-10-20 09:48:36,541 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: Verification succeeded for BP-1800471205-192.168.0.75-1376982711635:blk_7327706406895808634_413818
2013-10-20 09:48:38,541 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: Verification succeeded for BP-1800471205-192.168.0.75-1376982711635:blk_-7266731098245842836_792956
2013-10-20 09:48:38,578 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: Verification succeeded for BP-1800471205-192.168.0.75-1376982711635:blk_5632772206116911879_723652
2013-10-20 09:48:38,741 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: Verification succeeded for BP-1800471205-192.168.0.75-1376982711635:blk_2205969096317304795_406972
2013-10-20 09:49:07,941 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: Verification succeeded for BP-1800471205-192.168.0.75-1376982711635:blk_6526864009881162179_756595
2013-10-20 09:50:12,540 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: Verification succeeded for BP-1800471205-192.168.0.75-1376982711635:blk_3642034032334664011_871155
2013-10-20 09:50:12,559 INFO org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner: Verification succeeded for BP-1800471205-192.168.0.75-1376982711635:blk_9146423954430545212_666531
2013-10-20 09:50:18,821 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
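/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at TY103-006/192.168.0.81
The log above shows routine block verification succeeding until, at the end, the DataNode service was shut down.
Get the dead node's service information from the BDP management UI:
Platform Home > Management Console > Cluster Monitoring > Cluster Service Monitoring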
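Besides the two web interfaces, the list of dead nodes can usually be read programmatically. The sketch below assumes that the NameNode web port used in the confirmation step also serves the standard /jmx servlet, and that its NameNodeInfo bean carries a DeadNodes attribute holding a JSON-encoded string keyed by node name. Verify both on the deployed Hadoop version before relying on it. The sample payload is hand-written for illustration and is not real cluster output.

```python
import json
from urllib.request import urlopen

# Assumption: the NameNode web UI address from the confirmation step also serves /jmx.
JMX_URL = "http://10.1.242.182:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeInfo"


def dead_nodes(jmx_payload):
    """Return the sorted node names found in the NameNodeInfo 'DeadNodes' attribute."""
    bean = jmx_payload["beans"][0]
    # DeadNodes is itself a JSON-encoded string: {"<node>": {...}, ...}
    return sorted(json.loads(bean.get("DeadNodes", "{}")))


def fetch_dead_nodes(url=JMX_URL, timeout=10):
    """Query the NameNode over HTTP and return its dead DataNodes."""
    with urlopen(url, timeout=timeout) as resp:
        return dead_nodes(json.load(resp))


# Offline example with a hand-written payload (illustrative only).
sample = {"beans": [{"DeadNodes": json.dumps({"TY103-006": {"lastContact": 700}})}]}
print(dead_nodes(sample))
```

Call fetch_dead_nodes() from a host that can reach the NameNode. An empty list means the NameNode currently reports no dead DataNodes.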
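Troubleshooting:
Step 1: Log in to the failed node and check the log. Commands:
ssh snode6
cd /opt/hadoop/logs    # go to the log directory
tail -n 200 -f hadoop-hadoop-datanode-TY101-M01.log    # show the latest 200 log lines and keep following
The log file name ends with the hostname of the node it was written on, so use the file that matches the node you are logged in to.
Step 2: Start the DataNode service. Command:
/opt/hadoop/sbin/hadoop-daemon.sh start datanode    # start the DataNode service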
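The log check in step 1 can also be scripted, so the shutdown time does not have to be found by eye. The following is a minimal sketch that looks for the SHUTDOWN_MSG banner shown in the log excerpt earlier in this section. The function name and regular expressions are illustrative assumptions based on that excerpt, not part of Hadoop.

```python
import re

# Patterns inferred from the DataNode log excerpt in this section.
SHUTDOWN_RE = re.compile(r"SHUTDOWN_MSG: Shutting down DataNode at (\S+)")
TIMESTAMP_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} ")


def find_shutdown(lines):
    """Return (last_timestamp, node) for the first shutdown banner, or None."""
    last_ts = None
    for line in lines:
        stamp = TIMESTAMP_RE.match(line)
        if stamp:
            # Banner lines carry no timestamp, so remember the latest one seen.
            last_ts = stamp.group(1)
        hit = SHUTDOWN_RE.search(line)
        if hit:
            return last_ts, hit.group(1)
    return None


log_tail = [
    "2013-10-20 09:50:12,559 INFO org.apache.hadoop.hdfs.server.datanode."
    "BlockPoolSliceScanner: Verification succeeded for "
    "BP-1800471205-192.168.0.75-1376982711635:blk_9146423954430545212_666531",
    "2013-10-20 09:50:18,821 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:",
    "/************************************************************",
    "SHUTDOWN_MSG: Shutting down DataNode at TY103-006/192.168.0.81",
]
print(find_shutdown(log_tail))
```

On a real node, pass an open log file instead of the list, for example find_shutdown(open(path, encoding="utf-8")), where path is the DataNode log file from step 1.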