Troubleshooting HDFS lock exceptions caused by unclosed files


1. Check the failed task

java.io.IOException: Cannot obtain block length for LocatedBlock

Cause: the cluster was restarted while these files were still open for writing, so they were never closed.

Solution

[hdpusr@prdclt api]$ hadoop fsck -openforwrite /raw/rawdata/nginxlog/api/2017/12/20

DEPRECATED: Use of this script to execute hdfs command is deprecated.

Instead use the hdfs command for it.



Connecting to namenode via http://bdm:50070

FSCK started by hdpusr (auth:SIMPLE) from /10.29.229.199 for path /raw/rawdata/nginxlog/api/2017/12/20 at Thu Dec 21 09:37:23 CST 2017

......./raw/rawdata/nginxlog/api/2017/12/20/rawlog.1513710668740 101 bytes, 1 block(s), OPENFORWRITE:
/raw/rawdata/nginxlog/api/2017/12/20/rawlog.1513714086622 101 bytes, 1 block(s), OPENFORWRITE:
..../raw/rawdata/nginxlog/api/2017/12/20/rawlog.1513721803603 202 bytes, 1 block(s), OPENFORWRITE:
....../raw/rawdata/nginxlog/api/2017/12/20/rawlog.1513736107249 618 bytes, 1 block(s), OPENFORWRITE:
/raw/rawdata/nginxlog/api/2017/12/20/rawlog.1513736627780 543 bytes, 1 block(s), OPENFORWRITE:
./raw/rawdata/nginxlog/api/2017/12/20/rawlog.1513740257836 226 bytes, 1 block(s), OPENFORWRITE:
./raw/rawdata/nginxlog/api/2017/12/20/rawlog.1513743887955 680 bytes, 1 block(s), OPENFORWRITE:
./raw/rawdata/nginxlog/api/2017/12/20/rawlog.1513747513638 987 bytes, 1 block(s), OPENFORWRITE:
./raw/rawdata/nginxlog/api/2017/12/20/rawlog.1513751179039 359 bytes, 1 block(s), OPENFORWRITE:
/raw/rawdata/nginxlog/api/2017/12/20/rawlog.1513754790596 364 bytes, 1 block(s), OPENFORWRITE:
/raw/rawdata/nginxlog/api/2017/12/20/rawlog.1513754790596: MISSING 1 blocks of total size 364 B.
/raw/rawdata/nginxlog/api/2017/12/20/rawlog.1513754946865 124 bytes, 1 block(s), OPENFORWRITE:
/raw/rawdata/nginxlog/api/2017/12/20/rawlog.1513754946865: MISSING 1 blocks of total size 124 B.................
Status: CORRUPT

 Total size:    875420 B

 Total dirs:    1

 Total files:    48

 Total symlinks:        0

 Total blocks (validated):    48 (avg. block size 18237 B)

 ********************************

 CORRUPT FILES:    2

 MISSING BLOCKS:    2

 MISSING SIZE:        488 B

 ********************************

 Minimally replicated blocks:    46 (95.833336 %)

 Over-replicated blocks:    0 (0.0 %)

 Under-replicated blocks:    0 (0.0 %)

 Mis-replicated blocks:        0 (0.0 %)

 Default replication factor:    2

 Average block replication:    2.875

 Corrupt blocks:        0

 Missing replicas:        0 (0.0 %)

 Number of data-nodes:        7

 Number of racks:        1

FSCK ended at Thu Dec 21 09:37:23 CST 2017 in 1 milliseconds


The filesystem under path '/raw/rawdata/nginxlog/api/2017/12/20' is CORRUPT
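The fsck report above mixes progress dots with file entries, so the open files are hard to pick out by eye. As a minimal sketch, the helper below pulls out only the paths marked OPENFORWRITE. The function name `list_open_for_write` is hypothetical (it is not part of Hadoop), and the sketch assumes paths contain no spaces and that `grep -o` is available, as it is in the commands used in this post.

```shell
# Hypothetical helper (not part of Hadoop): print the paths that fsck
# reports as OPENFORWRITE. Reads fsck output on stdin.
# Assumption: HDFS paths contain no spaces.
list_open_for_write() {
  # Match "<path> <n> bytes, <n> block(s), OPENFORWRITE" anywhere on a line,
  # then keep only the first space-separated field (the path).
  grep -o '/[^ ]* [0-9][0-9]* bytes, [0-9][0-9]* block(s), OPENFORWRITE' \
    | cut -d' ' -f1
}

# Usage (needs a reachable cluster):
#   hdfs fsck /raw/rawdata/nginxlog/api/2017/12/20 -openforwrite | list_open_for_write
```

Because it matches entries anywhere on a line, the helper should also cope with output where several entries run together on one line.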

Deleting the files with missing blocks resolves the problem.
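The deletion step could be scripted roughly as follows. This is only a sketch: `delete_hdfs_files` and the `DRY_RUN` variable are hypothetical names introduced here, and the dry-run default is an assumption made for safety so that nothing is removed until the list has been reviewed.

```shell
# Hypothetical helper: delete every HDFS path read from stdin, one per line.
# Assumption: DRY_RUN defaults to 1, in which case the function only prints
# the command it would run. Set DRY_RUN=0 to actually delete.
delete_hdfs_files() {
  while IFS= read -r path; do
    # Skip blank lines in the input.
    [ -n "$path" ] || continue
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "hdfs dfs -rm $path"
    else
      # Redirect stdin so hdfs cannot consume the remaining list of paths.
      hdfs dfs -rm "$path" </dev/null
    fi
  done
}

# Usage:
#   echo /raw/rawdata/nginxlog/api/2017/12/20/rawlog.1513754790596 | delete_hdfs_files
```

Note that the data in the deleted files is lost, so it is worth checking the printed commands before switching off the dry run.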

2. How can the files with missing blocks be found quickly?

hadoop fsck /raw/rawdata/nginxlog/api/rawlog/2017/12/20 -openforwrite | egrep -v '^\.+$' | egrep "MISSING" | grep -o "/[^ ]*"| grep -o ".*:" |grep -o "/[^:]*"

DEPRECATED: Use of this script to execute hdfs command is deprecated.

Instead use the hdfs command for it.



Connecting to namenode via http://bdm:50070

/raw/rawdata/nginxlog/api/rawlog/2017/12/20/rawlog.1513699205174

/raw/rawdata/nginxlog/api/rawlog/2017/12/20/rawlog.1513746751995

/raw/rawdata/nginxlog/api/rawlog/2017/12/20/rawlog.1513750284730

/raw/rawdata/nginxlog/api/rawlog/2017/12/20/rawlog.1513752243079
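The chain of `egrep` and `grep -o` calls above can probably be collapsed into a single `sed` expression. The sketch below is one way to do that, not the original author's method. The name `list_missing_files` is hypothetical, and it assumes each MISSING report begins its own line in the form `<path>: MISSING ...`, as in the fsck output shown in section 1.

```shell
# Hypothetical helper: print each path that fsck flags with MISSING blocks.
# Reads fsck output on stdin.
# Assumption: every report line looks like "<path>: MISSING ...",
# optionally preceded by progress dots.
list_missing_files() {
  # Capture everything from the first "/" up to ": MISSING", print only
  # lines that matched, and drop duplicate paths.
  sed -n 's|^\.*\(/[^:]*\): MISSING .*|\1|p' | sort -u
}

# Usage (needs a reachable cluster):
#   hdfs fsck /raw/rawdata/nginxlog/api/rawlog/2017/12/20 -openforwrite | list_missing_files
```

Lines that only contain OPENFORWRITE entries or progress dots do not match the pattern, so they are skipped without a separate filtering step.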

Author: hnbian
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. Please credit hnbian when reposting!