From Array[Any] to Array[Double] in Scala Spark

Asked: 2019-04-03 12:12:53

Tags: scala apache-spark

For this code:

var p = result.select("finalFeatures").head.toSeq.toArray

the result is:

p: Array[Any] = Array([3.0,6.0,-0.7876947819954485,-0.21757635218517163,0.9731844373162398,-0.6641741696340382,-0.6860072219935377,-0.2990737363481845,-0.7075863760365155,0.8188108975549018,-0.8468559840943759,-0.04349947247406488,-0.45236764452589984,1.0333959313820456,0.609756607087835,-0.7106619551471779,-0.7750330808435969,-0.08097610412658443,-0.45338437108038904,-0.2952869863393396,-0.30959772365257004,0.6988768123463287,0.17049117199049213,3.2674649019757385,-0.8333373234944124,1.8462942520757128,-0.49441222531240125,-0.44187299748074166,-0.300810826687287])

I need this as an Array[Double]. How can I do that?

2 Answers:

Answer 0 (score: 1)

You can convert the Any values in the Array to Double like this:

 val pAsDouble = p.map(_.toString.toDouble)
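A minimal self-contained sketch of the same idea (the values below are made up for illustration and are not taken from the question):

// Hypothetical Array[Any] whose elements are all numeric
val p: Array[Any] = Array(3.0, 6.0, -0.78)

// Parse each element's string representation back into a Double
val pAsDouble: Array[Double] = p.map(_.toString.toDouble)
// pAsDouble: Array[Double] = Array(3.0, 6.0, -0.78)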

Answer 1 (score: 1)

Assume you have the following data:

val df = Seq(Array("2.3", "2.0", "5")).toDF("finalFeatures")
df.show

The output of the previous command will be:

+-------------+
|finalFeatures|
+-------------+
|[2.3, 2.0, 5]|
+-------------+

df.schema will print org.apache.spark.sql.types.StructType = StructType(StructField(finalFeatures,ArrayType(StringType,true),true)). To convert the column to an array of doubles, you can do:

val doubleSeq = df.select($"finalFeatures".cast("array<double>")).head.get(0).asInstanceOf[Seq[Double]]

and doubleSeq.foreach(println _) should produce the following output:

2.3
2.0
5.0
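
For reference, the same cast can also be applied to the whole column rather than to a single row; this is a sketch assuming the same df as above:

import org.apache.spark.sql.functions.col

// Cast the array<string> column to array<double> for every row
val casted = df.withColumn("finalFeatures", col("finalFeatures").cast("array<double>"))
casted.printSchema()
// root
//  |-- finalFeatures: array (nullable = true)
//  |    |-- element: double (containsNull = true)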