Reading the contents of a directory in Spark

Time: 2018-09-16 09:06:51

Tags: java apache-spark

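I'm a beginner just getting started, and I'd like to know how to read the contents of a directory and iterate over them. The equivalent C# code: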

foreach (var path in Directory.EnumerateFiles(directory, "*", ...)) { }

1 answer:

Answer 0: (score: 1)

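  import org.apache.spark.api.java.JavaPairRDD;
  import org.apache.spark.api.java.JavaSparkContext;
  import scala.Tuple2;

  // `sc` is an existing SparkContext and `path` is the directory to read.
  JavaSparkContext jsc = new JavaSparkContext(sc);
  // Each pair is (file path, entire file content).
  JavaPairRDD<String, String> rdd = jsc.wholeTextFiles(path);
  // collect() (instead of the deprecated toArray()) pulls every file to the driver.
  for (Tuple2<String, String> file : rdd.collect()) {
      System.out.println("+++++++++++++++++++++++++++++++++++++++++++");
      System.out.println("File name " + file._1());
      System.out.println("+++++++++++++++++++++++++++++++++++++++++++");
      System.out.println();
      System.out.println("-------------------------------------------");
      System.out.println("content " + file._2());
      System.out.println("-------------------------------------------");
  }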
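
If you only need the file paths and not the contents, and the directory is on the driver's local filesystem, plain Java may be enough and is closer to the C# `Directory.EnumerateFiles` call in the question. This is just a minimal sketch under those assumptions: the class name `DirectoryLister` and the helper `listFiles` are made up for illustration, and it does not go through Spark or HDFS at all.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class DirectoryLister {

    // Hypothetical helper: returns the regular files directly inside `dir`, sorted by path.
    public static List<Path> listFiles(Path dir) throws IOException {
        // Files.list is lazy and keeps the directory open, so close it with try-with-resources.
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .filter(Files::isRegularFile) // skip subdirectories
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Falls back to the current directory when no argument is given.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path file : listFiles(dir)) {
            System.out.println(file);
        }
    }
}
```

Note that `Files.list` does not descend into subdirectories; `Files.walk` would be the recursive variant.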

Hope this helps; I had the same problem.