Scala: mapping a list to a sparse vector

Asked: 2020-02-10 14:54:32

Tags: scala apache-spark rdd

My RDD has the form

Array[(String, Iterable[(Int, Double)])]

and its elements look like

(000267537-01,List((25,0.01), (35,120.0), (26,2.0), (38,130.0), (21,45.0), (54,180.0), (39,10.0)))

Given a constant vector size of 56, I want to convert the List part of each element into a sparse vector. So I tried:

val my_rslt = my_rdd.map(x => (x._1, Vectors.sparse(56, x._2)))

Then I got this error message:

<console>:37: error: overloaded method value sparse with alternatives:
  (size: Int,elements: java.lang.Iterable[(Integer, java.lang.Double)])org.apache.spark.mllib.linalg.Vector <and>
  (size: Int,elements: Seq[(Int, scala.Double)])org.apache.spark.mllib.linalg.Vector
 cannot be applied to (Int, Iterable[(Int, scala.Double)])
       val my_rslt = my_rdd.map(x => (x._1, Vectors.sparse(56, x._2)))
                                                    ^

So what is going wrong here? How does "Vectors.sparse" work?

1 answer:

Answer 0 (score: 0):

After many attempts, I found that I should do this:

val my_rslt = my_rdd.map(x => (x._1, Vectors.sparse(56, x._2.toList)))

However, I am still confused about what calling "toList" on something that is already a list actually changes.
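A likely explanation, based on the error message itself: the compiler picks an overload from the static type of the argument, not from its runtime class. Here x._2 is declared as Iterable[(Int, Double)], and neither listed alternative accepts that type, since scala.Iterable is not a Seq and is not java.lang.Iterable. The value happens to be a List at runtime, which is why it prints as List(...), but the compiler cannot rely on that. Calling toList (or toSeq) returns a value whose static type is List (or Seq), so the Seq[(Int, Double)] overload applies. The sketch below shows this outside of an RDD. It assumes spark-mllib is on the classpath; the names SparseFromIterable and toSparse are hypothetical and used only for illustration.

```scala
import org.apache.spark.mllib.linalg.{Vector, Vectors}

object SparseFromIterable {
  // Hypothetical helper; the default size of 56 is the constant from the question.
  // pairs.toSeq has static type Seq[(Int, Double)], which selects the Seq
  // overload of Vectors.sparse. Passing `pairs` directly does not compile.
  def toSparse(pairs: Iterable[(Int, Double)], size: Int = 56): Vector =
    Vectors.sparse(size, pairs.toSeq)

  def main(args: Array[String]): Unit = {
    // The runtime object is a List, but the declared type is only Iterable,
    // which mirrors the type of x._2 in the question.
    val pairs: Iterable[(Int, Double)] =
      List((25, 0.01), (35, 120.0), (26, 2.0), (38, 130.0))

    val v = toSparse(pairs)
    println(v.size) // 56
    println(v(35))  // 120.0
  }
}
```

Applied to the RDD from the question, the same idea would read my_rdd.map(x => (x._1, Vectors.sparse(56, x._2.toSeq))); toSeq and toList should both work here, since either one yields a Seq.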