Why does the Spark Java driver force me to install winutils.exe, which is supposed to be a Hadoop binary, when I want to run it in local standalone mode against a local, non-HDFS file system?
import org.apache.spark.api.java.*;
import org.apache.spark.SparkConf;

public class Main {
    public static void main(String[] args) {
        // Plain local file, no HDFS involved
        String testFile = "test.iml";

        // Run entirely locally, using all available cores
        SparkConf conf = new SparkConf().setAppName("Test application").setMaster("local[*]");
        JavaSparkContext sparkContext = new JavaSparkContext(conf);

        JavaRDD<String> logData = sparkContext.textFile(testFile).cache();

        long numAs = logData.filter((s) -> s.contains("a")).count();
        long numBs = logData.filter((s) -> s.contains("b")).count();

        System.out.println(numBs);
        System.out.println(numAs);
    }
}
Error message:
ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
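For reference, the only workaround I have found so far is to point hadoop.home.dir at a local directory containing bin\winutils.exe before the JavaSparkContext is created. A minimal sketch is below; the C:\hadoop path is just an assumed example location where winutils.exe would have been downloaded, not something Spark ships with. My question remains why this should be needed at all for a purely local, non-HDFS run.

// Workaround sketch, assuming winutils.exe has been manually placed at
// C:\hadoop\bin\winutils.exe (the path is an example, not provided by Spark)
System.setProperty("hadoop.home.dir", "C:\\hadoop");

SparkConf conf = new SparkConf().setAppName("Test application").setMaster("local[*]");
JavaSparkContext sparkContext = new JavaSparkContext(conf);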