Retrieving data from a WrappedArray in Scala

Time: 2017-07-16 14:50:36

Tags: scala apache-spark

I have the following simple program, and I don't know how to read the values inside the array in Scala.

val all_marks = Result.groupBy("class", "school").agg(collect_list("mark") as "marks",count("*") as "cnt").where($"cnt" > 10)

var mrk=all_marks.collect().map(mark=>""+mark(2))

The result looks like this:

mrk: Array[String] = Array(WrappedArray(52.0, 18.0, 17.0, 36.0, 22.0, 22.0), WrappedArray(49.0, 53.0, 41.0, 30.0, 48.0, 36.0))

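I need to iterate over the array (mrk) and read each WrappedArray separately, so that I can run further mathematical calculations on every mark inside each WrappedArray. How can I read each WrappedArray in a simple way?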

1 answer:

Answer 0 (score: 0)

You need to replace

var mrk = all_marks.collect().map(mark => "" + mark(2))

with

val mrk = all_marks.select("marks")

Then convert the DataFrame to an RDD (of lists), and convert it back to a DataFrame:

// getSeq returns a Scala Seq, so .toList works without a Java-to-Scala conversion
val toRDD = mrk.rdd.map(_.getSeq[Int](0).toList).toDF("marks")
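A note on the conversion step: `Row.getList` returns a `java.util.List`, which has no `toList` method of its own, whereas `Row.getSeq` returns a Scala `Seq`. If a Java list is what you end up with, it has to be converted explicitly. The sketch below shows that conversion in plain Scala, without Spark. The object and method names (`ListConversion`, `toScalaList`) are made up for illustration and are not part of the original answer.

```scala
import scala.collection.JavaConverters._

// Hypothetical helper for illustration only; not part of the original answer.
object ListConversion {
  // Converts a Java list (such as the one Row.getList returns) to an immutable Scala List.
  def toScalaList[A](javaList: java.util.List[A]): List[A] =
    javaList.asScala.toList
}
```

On Scala 2.13 and later, `scala.jdk.CollectionConverters` is the preferred import; `JavaConverters` is used here on the assumption of an older Scala version, as was typical for Spark at the time of the question.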

Then define a UDF:

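    import org.apache.spark.sql.functions.udf

    // Define the UDF: sort one row's marks and join them into a single string
    val createUdf = udf((list: Seq[Int]) => {
      val ascending = list.sorted // sorts in ascending order
      // Kept local so the string starts empty for every row
      var read_row_by_row = ""
      // Add whatever calculations you need inside this loop
      for (i <- ascending.indices) {
        read_row_by_row = read_row_by_row + "," + ascending(i)
      }
      read_row_by_row
    })
    // Overwrite the "marks" column with the UDF result
    val g = toRDD.withColumn("marks", createUdf($"marks"))
    g.show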
+--------------------+
|               marks|
+--------------------+
|,17,17,17,17,18,1...|
|,18,18,18,18,19,1...|
|,18,23,24,24,24,2...|
|,18,23,24,24,24,2...|
|,17,18,18,18,18,1...|
|,25,35,36,39,41,4...|
|,25,35,36,39,41,4...|
|,31,31,33,33,33,3...|
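
If the goal is a numeric calculation per group rather than a string, the same per-row logic can be written as an ordinary function over a Scala `Seq`. The following is a minimal sketch in plain Scala, without Spark, using the two sample arrays from the question. It assumes the marks have already been read as `Seq[Double]` (for example with `Row.getSeq[Double]`, assuming the column holds doubles) instead of being turned into a string with `"" + mark(2)`. `MarkStats`, `Summary`, and `summarize` are hypothetical names chosen for illustration.

```scala
// Hypothetical helper for illustration only; not part of the original answer.
object MarkStats {
  // Result of the per-group calculation (fields chosen for illustration).
  final case class Summary(sorted: Seq[Double], mean: Double, max: Double)

  // Computes the sorted marks, their mean, and their maximum for one non-empty group.
  def summarize(marks: Seq[Double]): Summary = {
    require(marks.nonEmpty, "marks must not be empty")
    Summary(marks.sorted, marks.sum / marks.size, marks.max)
  }

  def main(args: Array[String]): Unit = {
    // Sample data copied from the output shown in the question.
    val groups: Seq[Seq[Double]] = Seq(
      Seq(52.0, 18.0, 17.0, 36.0, 22.0, 22.0),
      Seq(49.0, 53.0, 41.0, 30.0, 48.0, 36.0)
    )
    // Each inner Seq is processed separately, one summary per group.
    groups.map(summarize).foreach(println)
  }
}
```

A function like `summarize` could also be wrapped in `udf(...)` in place of the string-building loop, so the calculation runs on the executors instead of on collected data.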