Error from createDataFrame in Scala Spark

Time: 2017-01-20 10:20:10

Tags: scala apache-spark spark-dataframe rdd

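I have a file that contains the (x, y, z) coordinates of objects across 10 frames. When I try to create a DataFrame from it, I get an error. The code is below.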

 val schema_each_atom = Row(Seq(
 StructField("x", DoubleType, false),
 StructField("y", DoubleType, false),
 StructField("z", DoubleType, false)))

 var file_text = sc.textFile(file_name)
 var header = file_text.first() 
 file_text = file_text.filter(row => row != header)

 var temp_data = file_text.map(s => regex.findAllMatchIn(s).
     map(_.matched.toDouble).toList).collect.toList.flatten.toList
 var frames_from_file = temp_data.grouped(10).toList

 frames_from_file.foreach { x => 
   var each_atom_coord = x.grouped(number_of_atoms).toList.map(x => x.toSeq)

   var rdd_each_atom_coord = sc.makeRDD(each_atom_coord)
   var frame = sqlContext.createDataFrame(rdd_each_atom_coord, schema_each_atom)
 }

The last line (the createDataFrame call) gives the error.
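A likely cause, judging only from the code shown: `createDataFrame(rdd, schema)` expects an `RDD[Row]` together with a `StructType`, but the snippet builds the schema with `Row(Seq(...))` and passes an RDD whose elements are plain sequences of doubles. The sketch below shows one way the call could be made to type-check. It assumes Spark 2.x or later (`SparkSession`; with an older `sqlContext` the same `createDataFrame(RDD[Row], StructType)` overload should apply), and it assumes each row is one atom's three coordinates, so a frame's length must be a multiple of 3. The names `FrameToDataFrame`, `frameToRows` and `frameToDataFrame`, and the sample values, are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{DoubleType, StructField, StructType}

object FrameToDataFrame {
  // The schema has to be a StructType; Row(Seq(...)) would build a data row instead.
  val schemaEachAtom: StructType = StructType(Seq(
    StructField("x", DoubleType, nullable = false),
    StructField("y", DoubleType, nullable = false),
    StructField("z", DoubleType, nullable = false)))

  // Hypothetical helper: split one frame's flat values into (x, y, z) triples
  // and wrap each triple in a Row so that it matches the three-column schema.
  // Assumption: frame.length is a multiple of 3.
  def frameToRows(frame: Seq[Double]): Seq[Row] =
    frame.grouped(3).map(xyz => Row.fromSeq(xyz)).toList

  // Hypothetical helper: build an RDD[Row] and pair it with the StructType.
  def frameToDataFrame(spark: SparkSession, frame: Seq[Double]): DataFrame = {
    val rowRdd = spark.sparkContext.makeRDD(frameToRows(frame))
    spark.createDataFrame(rowRdd, schemaEachAtom)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("frames")
      .getOrCreate()

    // Made-up sample data: a single frame holding two atoms.
    val frame = Seq(0.0, 1.0, 2.0, 3.0, 4.0, 5.0)
    frameToDataFrame(spark, frame).show()

    spark.stop()
  }
}
```

In the original loop, the equivalent change would be to map each group to `Row.fromSeq(...)` before calling `sc.makeRDD`, and to declare `schema_each_atom` with `StructType(...)` rather than `Row(...)`. Whether the groups should hold 3 values or `number_of_atoms` values depends on the file layout, which the question does not show.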

0 answers:

No answers