Given I have this Spark function:
val group = whereRdd
  .map(collection => collection.getLong("location_id") -> collection.getInt("feel"))
  .groupByKey
  .map(grouped => grouped._1 -> grouped._2.toSet)

group.foreach(g => println(g))
I got:
(639461796080961,Set(15))
(214680441881239,Set(5, 10, 25, -99, 99, 19, 100))
(203328349712668,Set(5, 10, 15, -99, 99, 15, 10))
Is it possible to add a Map() to this function, so that each set also carries its avg and sum? For example:
(639461796080961,Map("data" -> Set(5, 10, 25, -99, 99, 19, 100), "avg" -> 22.71, "sum" -> 159))
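Taken literally, the requested shape can be sketched with plain Scala collections standing in for the RDD (the sample pairs below are made up; in the question they come from whereRdd). Note that a Map[String, Any] mixes a Set, a Double and an Int under one value type:

```scala
// Hypothetical stand-in for whereRdd's (location_id, feel) pairs.
val pairs = Seq(203328349712668L -> 5, 203328349712668L -> 10)

// groupBy plays the role of Spark's groupByKey on a local collection.
val group = pairs
  .groupBy(_._1)
  .map { case (id, kvs) =>
    val data = kvs.map(_._2).toSet
    // "data", "avg" and "sum" are packed into one Map[String, Any],
    // which erases the individual types of the three values.
    id -> Map("data" -> data, "avg" -> data.sum / data.size.toDouble, "sum" -> data.sum)
  }

group.foreach(println)
```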
Answer 0 (score: 2)
I would suggest using a Tuple or a case class instead of a Map. I mean roughly something like this:
case class Location(id: Long, values: Set[Int], sum: Int, avg: Double)

val group = whereRdd
  .map(collection =>
    collection.getLong("location_id") -> collection.getInt("feel"))
  .groupByKey
  .map { case (id, values) =>
    val set = values.toSet
    val sum = set.sum
    val mean = sum / set.size.toDouble
    Location(id, set, sum, mean)
  }
The biggest advantage over a Map is that it keeps the types in order.
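Outside Spark, the same per-key aggregation into the typed case class can be sketched with plain Scala collections (the sample pairs are made up; in the answer above the data comes from whereRdd, and groupBy stands in for Spark's groupByKey):

```scala
case class Location(id: Long, values: Set[Int], sum: Int, avg: Double)

// Hypothetical (location_id, feel) pairs standing in for whereRdd.
val pairs = Seq(
  639461796080961L -> 15,
  214680441881239L -> 5,
  214680441881239L -> 10
)

val group = pairs
  .groupBy(_._1)                 // local analogue of groupByKey
  .map { case (id, kvs) =>
    val set = kvs.map(_._2).toSet
    val sum = set.sum
    // Each field of Location keeps its own static type,
    // unlike a Map[String, Any].
    Location(id, set, sum, sum / set.size.toDouble)
  }

group.foreach(println)
```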