This is not a duplicate question.
I am working on a Spark application in Java on Windows 10.
For my task I need to read multiple .txt files from a directory, so I call the
wholeTextFiles("C:\Users\Desktop\input")
method on the actual directory.
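My understanding is that wholeTextFiles should return one (filePath, fileContent) pair per file in the directory; a minimal sketch of what I expect (sc is the JavaSparkContext from the full code below):

import org.apache.spark.api.java.JavaPairRDD;
import scala.Tuple2;

// Expectation: one (path, content) pair per file under the input directory.
JavaPairRDD<String, String> files = sc.wholeTextFiles("C:\\Users\\keyur\\Desktop\\input");
for (Tuple2<String, String> pair : files.collect()) {
    System.out.println(pair._1() + " -> " + pair._2().length() + " characters");
}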
But it gives the following error:
Exception in thread "main" java.lang.RuntimeException: Error while running command to get file permissions : java.io.IOException: (null) entry in command string: null ls -F C:\Users\keyur\Desktop\input\file1.txt
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:762)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:859)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:842)
at org.apache.hadoop.fs.FileUtil.execCommand(FileUtil.java:1097)
at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:587)
at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:562)
at org.apache.hadoop.fs.LocatedFileStatus.<init>(LocatedFileStatus.java:47)
at org.apache.hadoop.fs.FileSystem$4.next(FileSystem.java:1701)
at org.apache.hadoop.fs.FileSystem$4.next(FileSystem.java:1681)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:303)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:264)
at org.apache.spark.input.WholeTextFileInputFormat.setMinPartitions(WholeTextFileInputFormat.scala:52)
at org.apache.spark.rdd.WholeTextFileRDD.getPartitions(WholeTextFileRDD.scala:54)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:46)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2099)
at org.apache.spark.rdd.RDD.count(RDD.scala:1168)
at org.apache.spark.api.java.JavaRDDLike$class.count(JavaRDDLike.scala:455)
at org.apache.spark.api.java.AbstractJavaRDDLike.count(JavaRDDLike.scala:45)
at sparkDemo.sparkDemo.WordCounter.main(WordCounter.java:29)
However, the same code works for a single file, e.g. "C:\Users\keyur\Desktop\input\file1.txt". For a directory it gives:
java.lang.RuntimeException: Error while running command to get file permissions : java.io.IOException: (null) entry in command string: null ls -F C:\Users\keyur\Desktop\input\file1.txt
Source code:
package sparkDemo.sparkDemo;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class WordCounter {
    public static void main(String[] args) {
        System.out.println("Hello World!");
        SparkConf conf = new SparkConf().setAppName("WordCounter").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Directory containing the input .txt files
        String path = "C:\\Users\\keyur\\Desktop\\input";
        // String path = args[0];
        System.out.println("Trying to open: " + path);

        // wholeTextFiles yields one (filePath, fileContent) pair per file
        JavaPairRDD<String, String> lines = sc.wholeTextFiles(path);
        System.out.println("Lines count: " + lines.count());

        lines.saveAsTextFile("C:\\Users\\keyur\\OneDrive\\Desktop\\op10");
        // lines.saveAsTextFile(args[1]);
        sc.stop();
    }
}
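From the message, the null in "null ls -F" seems to be the slot where Hadoop's Shell utility would normally put the winutils.exe path on Windows, so I suspect HADOOP_HOME / hadoop.home.dir is not set. A minimal sketch of the workaround I am considering, assuming winutils.exe is installed under C:\hadoop\bin (that install path is an assumption):

// Possible workaround (untested): tell Hadoop where winutils.exe lives before
// the SparkContext starts. "C:\\hadoop" is an assumed install location and must
// contain bin\winutils.exe matching the Hadoop version on the classpath.
System.setProperty("hadoop.home.dir", "C:\\hadoop");

SparkConf conf = new SparkConf().setAppName("WordCounter").setMaster("local[*]");
JavaSparkContext sc = new JavaSparkContext(conf);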