Could not locate executable null\bin\winutils.exe in the Hadoop binaries


1. Background

When running Hadoop or Spark programs on Windows, you will often run into an error like the following:

java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
    at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:378)
    at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:393)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:386)
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)
    at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:116)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:93)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:73)
    at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:293)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:283)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:260)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:789)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:774)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:647)
    at org.apache.spark.util.Utils$.$anonfun$getCurrentUserName$1(Utils.scala:2422)
    at scala.Option.getOrElse(Option.scala:138)
    at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2422)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:293)
    at hnbian.spark.utils.SparkUtils$.getSparkSession(SparkUtils.scala:19)
    at hnbian.sparkml.classnews.LogisticRegressionDemo$.delayedEndpoint$hnbian$sparkml$classnews$LogisticRegressionDemo$1(LogisticRegressionDemo.scala:17)
    at hnbian.sparkml.classnews.LogisticRegressionDemo$delayedInit$body.apply(LogisticRegressionDemo.scala:13)
    at scala.Function0.apply$mcV$sp(Function0.scala:39)
    at scala.Function0.apply$mcV$sp$(Function0.scala:39)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
    at scala.App.$anonfun$main$1$adapted(App.scala:80)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at scala.App.main(App.scala:80)
    at scala.App.main$(App.scala:78)
    at hnbian.sparkml.classnews.LogisticRegressionDemo$.main(LogisticRegressionDemo.scala:13)

Although it usually does not affect the program's result, seeing this error printed to the console on every run is still annoying.

2. Solution

Here is a fix you can use for reference:
First, download the needed files from https://github.com/steveloughran/winutils and unzip them.
Then configure hadoop.home.dir in your code, e.g.:

System.setProperty("hadoop.home.dir", "D:\\ProgramFiles\\winutils-master\\hadoop-2.7.1")
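Order matters here: hadoop.home.dir must be set before the first Hadoop class is loaded, because org.apache.hadoop.util.Shell resolves winutils.exe in a static initializer (visible as Shell.&lt;clinit&gt; in the stack trace above). A minimal sketch, with the SparkSession construction left as a comment since it depends on your own build:

```java
public class HadoopHomeSetup {
    public static void main(String[] args) {
        // Set this FIRST, before any Hadoop/Spark class is touched;
        // Shell reads it only once, when the class is loaded.
        System.setProperty("hadoop.home.dir",
                "D:\\ProgramFiles\\winutils-master\\hadoop-2.7.1");

        // ... then build the SparkSession, e.g.:
        // SparkSession spark = SparkSession.builder()
        //         .master("local[*]").appName("demo").getOrCreate();
    }
}
```

Alternatively, setting the HADOOP_HOME environment variable to the same directory (and adding %HADOOP_HOME%\bin to PATH) achieves the same effect without any code change.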

3. Other issues

When writing data, I ran into an error similar to the one below; configuring hadoop.home.dir as described above fixed it as well.

java.io.IOException: (null) entry in command string: null chmod 0644
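A quick sanity check (a hypothetical helper, not part of the original post) can confirm that winutils.exe actually exists under the configured directory before you start Spark:

```java
import java.io.File;

public class WinutilsCheck {
    public static void main(String[] args) {
        // Prefer the system property; fall back to the HADOOP_HOME env var.
        String hadoopHome = System.getProperty("hadoop.home.dir",
                System.getenv("HADOOP_HOME"));
        if (hadoopHome == null) {
            System.out.println("hadoop.home.dir / HADOOP_HOME is not set");
            return;
        }
        File winutils = new File(new File(hadoopHome, "bin"), "winutils.exe");
        System.out.println(winutils.getPath() + " exists: " + winutils.exists());
    }
}
```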


Author: hnbian
Copyright: Unless otherwise noted, all articles on this blog are licensed under CC BY 4.0. Please credit hnbian as the source when republishing!