Save JavaDStream<List<String>> as Parquet in Spark Streaming

Time: 2017-07-17 03:20:17

Tags: apache-spark apache-spark-sql spark-streaming spark-dataframe

I have the following code:

JavaDStream<List<String>> records = messages.map(new Function<Tuple2<String,String>, List<String>>() {
        private static final long serialVersionUID = 1L;

        @Override
        public List<String> call(Tuple2<String, String> tuple2) throws Exception {
            // A local variable is enough here; the function does not need a field to hold the result.
            List<String> splitJsons = buildResultMap(tuple2._2());
            return splitJsons;
        }

    });

Now I need to save "records" as Parquet using the Java DataFrame API. I tried the following:

records.foreachRDD(new VoidFunction2<JavaRDD<List<String>>, Time>() {
        private static final long serialVersionUID = 1L;

        @Override
        public void call(JavaRDD<List<String>> rddList, Time time) throws Exception {
            sqlContext = SQLContextSingleton.getInstance(rddList.context());
            DataFrame wordsDataFrame = rddList.flatMap(w => Record(w)).toDF();
            wordsDataFrame.write().mode("Append").parquet("/tmp/parquet");

        }

    });

However, I cannot convert the JavaDStream<List<String>> to a DataFrame. It seems I first have to convert the JavaDStream<List<String>> to a JavaDStream<String>, and then convert that to a DataFrame.
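
The flattening step described above can be sketched in plain Java, independent of Spark. This is only an illustration under assumptions: `FlattenSketch` and `flatten` are hypothetical names, and the sample JSON strings stand in for whatever `buildResultMap` actually returns. In Spark the equivalent step would be a `flatMap` over the stream, after which each batch is a `JavaRDD<String>` rather than a `JavaRDD<List<String>>`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class for illustration; not part of the question's code.
final class FlattenSketch {

    // Turns a batch of List<String> records into one flat list of strings,
    // preserving order. This mirrors what flatMap does to each batch.
    static List<String> flatten(List<List<String>> batch) {
        List<String> flat = new ArrayList<>();
        for (List<String> record : batch) {
            flat.addAll(record);
        }
        return flat;
    }

    public static void main(String[] args) {
        // Assumed sample data: each inner list plays the role of one
        // buildResultMap(...) result.
        List<List<String>> batch = Arrays.asList(
                Arrays.asList("{\"id\":1}", "{\"id\":2}"),
                Arrays.asList("{\"id\":3}"));
        System.out.println(flatten(batch));
    }
}
```

If the strings are JSON documents, one option in Spark 1.x may be to pass the flattened `JavaRDD<String>` to `sqlContext.read().json(...)` inside `foreachRDD` and write the resulting DataFrame with `write().mode(SaveMode.Append).parquet(...)`. Whether that fits depends on what `buildResultMap` produces, which the question does not show.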

0 answers:

No answers