org.apache.hadoop.mapred.InvalidInputException: Input path does not exist — error on Windows

Date: 2018-09-12 16:43:29

Tags: apache-spark rdd

I am running Spark on a Windows machine. I am a beginner, and I ran into this problem while creating an RDD from a TSV file:

scala> val fileRDD = sc.textFile("D:/work/testdata/test.tsv")
fileRDD: org.apache.spark.rdd.RDD[String] = D:/work/testdata/test.tsv MapPartitionsRDD[7] at textFile at <console>:24

scala> fileRDD.first()
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/D:/work/testdata/test.tsv
  at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:287)
  at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:229)
  at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)
  at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
  at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
  at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1337)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
  at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
  at org.apache.spark.rdd.RDD.take(RDD.scala:1331)
  at org.apache.spark.rdd.RDD$$anonfun$first$1.apply(RDD.scala:1372)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
  at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
  at org.apache.spark.rdd.RDD.first(RDD.scala:1371)
  ... 49 elided

The file does exist at this location. I am not using Hadoop; I am running Spark in local mode. Any help would be appreciated.
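One debugging step worth trying (this is a suggestion, not part of the original question) is to confirm that the same JVM running the shell can actually see the file, and then pass Spark an explicit `file:///` URI so the path cannot be misinterpreted relative to `fs.defaultFS`. A minimal sketch, runnable inside `spark-shell`, using the path from the question:

```scala
// Sketch: verify the path from the JVM's point of view before handing it to Spark.
import java.nio.file.{Files, Paths}

val path = "D:/work/testdata/test.tsv"  // path taken from the question above

// If this prints false, the spark-shell process cannot see the file
// (wrong drive, typo, or a permissions issue), which would explain the error.
val exists = Files.exists(Paths.get(path))
println(s"JVM can see the file: $exists")

// Build an unambiguous file URI, e.g. "file:///D:/work/testdata/test.tsv".
val uri = Paths.get(path).toUri.toString

// Then read via the explicit URI (uncomment inside spark-shell):
// val fileRDD = sc.textFile(uri)
// fileRDD.first()
```

If `Files.exists` already returns `false`, the problem is outside Spark entirely; if it returns `true` but `textFile` still fails, comparing the printed URI against the path in the exception message (`file:/D:/work/testdata/test.tsv`) can reveal where the resolution diverges.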

0 Answers