How do I reduce a list of lists of (name, value) pairs in Scala, as shown below?
For example:
[(a,1), (b,2), (c,4)]
[(a,2), (b,3), (c,4)]
[(a,1), (b,3), (c,4)]
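should reduce to a list of (name, most frequent value across all lists, occurrences of that value / total occurrences of the name):
[(a,1,0.66), (b,3,0.66), (c,4,1)]
If several values for a name occur equally often, any one of them will do.
Here is what I tried. I created a list l and ran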
l.groupBy(_._1).mapValues(_.groupBy(_._2)).mapValues(_.mapValues(_.size)).toList.map(x => (x._1,x._2.toList))
which gave me
List((b,List((2,1), (3,2))), (a,List((2,1), (1,2))), (c,List((4,3))))
I think I'm close, but I'd appreciate some help getting the rest of the way.
Answer 0 (score: 1)
Here's one way to get what you need:
val lol = List(
  List( ("a", 1), ("b", 2), ("c", 4) ),
  List( ("a", 2), ("b", 3), ("c", 4) ),
  List( ("a", 1), ("b", 3), ("c", 4) )
)
val list = lol.flatten
// Total number of occurrences per name
val t1Map = list.groupBy(_._1).mapValues(_.size)
// Count each (name, value) pair, turn the count into a ratio, keep the highest ratio per name
val tupleMap = list.groupBy(identity).mapValues(_.size).
  map{ case ((x, y), c) => ((x, y), c.toDouble / t1Map(x)) }.
  groupBy(_._1._1).mapValues(_.map(_._2).max)
// tupleMap: scala.collection.immutable.Map[String,Double] = Map(
// b -> 0.6666666666666666, a -> 0.6666666666666666, c -> 1.0
// )
[UPDATE]
To capture the whole tuple together with its maximum occurrence count, here is a different approach:
val tupleMap = list.groupBy(identity).mapValues(_.size)
// tupleMap: scala.collection.immutable.Map[(String, Int),Int] = Map(
// (b,2) -> 1, (a,2) -> 1, (c,4) -> 3, (a,1) -> 2, (b,3) -> 2
// )
val t1Map = list.groupBy(_._1).mapValues(_.size)
// t1Map: scala.collection.immutable.Map[String,Int] = Map(b -> 3, a -> 3, c -> 3)
val t1MapMax = tupleMap.groupBy(_._1._1).mapValues(_.map(_._2).max)
// t1MapMax: scala.collection.immutable.Map[String,Int] = Map(b -> 2, a -> 2, c -> 3)
val resultMap = tupleMap.filter{ case (k, v) => v == t1MapMax(k._1) }.
  map{ case (k, v) => (k._1, k._2, v.toDouble / t1Map(k._1) ) }
// resultMap: scala.collection.immutable.Iterable[(String, Int, Double)] = List(
// (c,4,1.0), (a,1,0.6666666666666666), (b,3,0.6666666666666666)
// )
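One caveat with the filter above: it keeps every tuple whose count equals the per-name maximum, so a tie produces more than one row for that name. Since the question accepts any of the tied values, a minimal sketch that returns exactly one row per name could use maxBy instead. The object and method names here (MostFrequentSketch, mostFrequent) are hypothetical and not part of the original answer; map is used in place of mapValues so the result is a strict Map.

```scala
object MostFrequentSketch {
  // Hypothetical helper: for each name, returns (most frequent value, its share
  // of that name's occurrences). Ties resolve to a single, arbitrary value.
  def mostFrequent[K, V](lists: List[List[(K, V)]]): Map[K, (V, Double)] =
    lists.flatten
      .groupBy(_._1) // name -> all (name, value) pairs for that name
      .map { case (name, pairs) =>
        // value -> number of times it occurs for this name
        val counts = pairs.groupBy(_._2).map { case (v, ps) => (v, ps.size) }
        // maxBy yields one entry even when several values share the top count
        val (value, count) = counts.maxBy(_._2)
        (name, (value, count.toDouble / pairs.size))
      }
}
```

Calling MostFrequentSketch.mostFrequent(lol) on the sample data gives one entry each for a, b and c, with the same values and ratios as resultMap.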
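Answer 1 (score: 0)
You can do the same in a single line with a bit of higher-order-function magic. For each name, this computes the maximum value and the name's share of all elements: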
scala> elems
res1: List[(Char, Int)] = List((a,1), (a,2), (d,1), (b,3), (a,4), (d,5))
scala> elems.groupBy(_._1).map(tuple => (tuple._1, tuple._2.map(_._2).max, tuple._2.length/(elems.length:Float)))
res2: scala.collection.immutable.Iterable[(Char, Int, Float)] = List((b,3,0.16666667), (d,5,0.33333334), (a,4,0.5))
Basically, the first groupBy groups the elements by name into a map.
scala> val groupedElems = elems.groupBy(_._1)
groupedElems: scala.collection.immutable.Map[Char,List[(Char, Int)]] = Map(b -> List((b,3)), d -> List((d,1), (d,5)), a -> List((a,1), (a,2), (a,4)))
After that, we have all the information needed to build the final solution:
scala> groupedElems.map(tuple =>
| (tuple._1 // I want the name
| , tuple._2.map(_._2).max // along with the max
| , tuple._2.length / (elems.length: Float) // and the occurrences
| )
| )
res33: scala.collection.immutable.Iterable[(Char, Int, Float)] =
List((b,3,0.16666667), (d,5,0.33333334), (a,4,0.5))
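For use outside the REPL, the same pipeline can be sketched as a small function with pattern matching in place of the tuple accessors. The names MaxAndShareSketch and maxAndShare are hypothetical, and Double is used instead of Float as an assumption about the precision wanted. Note that this reports the largest value per name and the name's share of all elements, which is not the same as the most frequent value asked for in the question.

```scala
object MaxAndShareSketch {
  // Hypothetical helper: for each name, returns (largest value seen,
  // fraction of all elements that carry this name).
  def maxAndShare[K](elems: List[(K, Int)]): Map[K, (Int, Double)] = {
    val total = elems.length.toDouble
    elems.groupBy(_._1).map { case (name, pairs) =>
      val values = pairs.map(_._2) // all values recorded for this name
      (name, (values.max, pairs.length / total))
    }
  }
}
```

With the elems list from the session above, the entry for a is (4, 0.5), matching the REPL output.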