Saving an RDD of an array of arrays to a text file in Spark

Time: 2016-02-23 01:48:54

Tags: arrays list scala apache-spark rdd

I have an RDD called tmp, of type

"org.apache.spark.rdd.RDD[(String, List[(String, String, Double)])]" 

with values as shown below.

Array[(String, List[(String, String, Double)])] = Array((1076486,List((1076486,1076486,0.0), (1076486,431000,0.7438727490345501), (1076486,351632,3.139055446043724), (1076486,431611,6.173095256463185))), (430067,List((430067,430067,0.0), (430067,1037380,4.0390818750047535), (430067,431611,6.396930255172381), (430067,824889,7.265222659014164))))

My output should be the inner contents of the lists, one triple per line, like this:

1076486,1076486,0.0
1076486,431000,0.7438727490345501
.
.
430067,1037380,4.0390818750047535

I tried:

.mapValues(_.toList).saveAsTextFile

which shows up in the file as follows.

(1076486,List((1076486,1076486,0.0), (1076486,431000,0.7438727490345501), (1076486,351632,3.139055446043724), (1076486,431611,6.173095256463185)))
(430067,List((430067,430067,0.0), (430067,1037380,4.0390818750047535), (430067,431611,6.396930255172381), (430067,824889,7.265222659014164)))

I can print the required data with the following code

tmp.collect().foreach(a=> {a.foreach(e=>print(e+" "))})

but I cannot save it to a file.

How can I get the desired result?

1 answer:

Answer 0: (score: 4)

Just build the output string manually:

// Drop the keys and flatten the inner lists, so each element is one (String, String, Double) triple.
// Then format every triple as a comma-separated line before saving.
// "output_dir" is a placeholder path, not taken from the question.
tmp
  .flatMap { case (_, triples) => triples }
  .map { case (a, b, d) => s"$a,$b,$d" }
  .saveAsTextFile("output_dir")
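
The idea of building the output string manually can be tried on a plain Scala collection before running it on the cluster. The sketch below is only an illustration: `sample` is a shortened local stand-in for the contents of `tmp`, and `toLine` is a hypothetical helper name that does not appear in the question.

```scala
// Local stand-in for the RDD's contents (values copied from the question, shortened).
val sample: List[(String, List[(String, String, Double)])] = List(
  ("1076486", List(("1076486", "1076486", 0.0), ("1076486", "431000", 0.7438727490345501))),
  ("430067", List(("430067", "430067", 0.0), ("430067", "1037380", 4.0390818750047535)))
)

// Hypothetical helper: turn one triple into a comma-separated line.
def toLine(t: (String, String, Double)): String = s"${t._1},${t._2},${t._3}"

// Drop the keys, flatten the inner lists, and format each triple.
val lines: List[String] = sample.flatMap(_._2).map(toLine)

// Print one formatted line per triple.
lines.foreach(println)
```

On the RDD itself the same two steps (flatMap over the list values, then map to a string) would be followed by saveAsTextFile, which writes the string form of each element on its own line. That is also why the attempt in the question produced lines like (1076486,List(...)): each element was still a whole tuple when it was written out.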