I am trying to add featureD, which is an array of Double, to the array passed to Vectors.dense, but I get this error:
templates/scala-parallel-classification/reading-custom-properties/src/main/scala/DataSource.scala:58:21: overloaded method value dense with alternatives:
[INFO] [Engine$] [error] (values: Array[Double])org.apache.spark.mllib.linalg.Vector <and>
[INFO] [Engine$] [error] (firstValue: Double,otherValues: Double*)org.apache.spark.mllib.linalg.Vector
[INFO] [Engine$] [error] cannot be applied to (Array[Any])
[INFO] [Engine$] [error] Vectors.dense(Array(
Here is my code:
  required = Some(List( // MODIFIED
    "featureA", "featureB", "featureC", "featureD", "label")))(sc)
  // aggregateProperties() returns RDD pair of
  // entity ID and its aggregated properties
  .map { case (entityId, properties) =>
    try {
      // MODIFIED
      LabeledPoint(properties.get[Double]("label"),
        Vectors.dense(Array(
          properties.get[Double]("featureA"),
          properties.get[Double]("featureB"),
          properties.get[Double]("featureC"),
          properties.get[Array[Double]]("featureD")
        ))
      )
    } catch {
      case e: Exception => {
        logger.error(s"Failed to get properties ${properties} of" +
          s" ${entityId}. Exception: ${e}.")
        throw e
      }
    }
How can I pass an array inside the array given to Vectors.dense?
Answer 0 (score: 0)
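Vectors.dense accepts either a single Array[Double] or doubles passed as separate arguments. It cannot take an array nested inside the array. Because your array mixes Double elements with an Array[Double] element, the compiler infers its type as Array[Any], which matches neither overload, so you get the error:
cannot be applied to (Array[Any])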
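The inference behind that message can be reproduced without Spark. The following is a minimal sketch in plain Scala 2 (the version implied by the error output); the object name MixedArrayDemo and the sample values are made up for illustration.

```scala
object MixedArrayDemo {
  // Stands in for the value returned by properties.get[Array[Double]]("featureD")
  val featureD: Array[Double] = Array(4.0, 5.0)

  // Only doubles: the element type is Double
  val onlyDoubles = Array(1.0, 2.0, 3.0)

  // Doubles plus an Array[Double] as the last element:
  // the only common supertype is Any, so this is an Array[Any]
  val mixed = Array(1.0, 2.0, 3.0, featureD)

  // Runtime array classes, as reported by the JVM
  val onlyDoublesClass: String = onlyDoubles.getClass.getSimpleName
  val mixedClass: String = mixed.getClass.getSimpleName
}
```

Here featureD ends up as one nested element of mixed rather than contributing its values, which is why neither overload of Vectors.dense accepts it.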
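To fix this, extend the first array with the second array using ++ instead of adding the second array as a single element. In this case, change the creation of the LabeledPoint to: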
LabeledPoint(properties.get[Double]("label"),
  Vectors.dense(
    Array(
      properties.get[Double]("featureA"),
      properties.get[Double]("featureB"),
      properties.get[Double]("featureC")
    ) ++ properties.get[Array[Double]]("featureD")
  )
)
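The effect of ++ can be checked in isolation. This is a rough sketch in plain Scala, not the PredictionIO template itself: denseStandIn is a hypothetical stand-in that only mirrors the Array[Double] parameter of Vectors.dense, and the feature values are invented.

```scala
object ConcatFeaturesDemo {
  // Hypothetical stand-in for Vectors.dense(values: Array[Double]).
  // It returns a Scala collection Vector, not an MLlib Vector.
  def denseStandIn(values: Array[Double]): Vector[Double] = values.toVector

  // Concatenates the scalar features with the array feature.
  // Array[Double] ++ Array[Double] yields a flat Array[Double].
  def features(a: Double, b: Double, c: Double, d: Array[Double]): Array[Double] =
    Array(a, b, c) ++ d

  val result: Vector[Double] =
    denseStandIn(features(1.0, 2.0, 3.0, Array(4.0, 5.0)))
}
```

The values of featureD are appended after featureA, featureB and featureC, so the resulting vector length is three plus the length of featureD. If that length varies between entities, the feature vectors will have different sizes, which is worth checking before training.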