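A better and more efficient way to group a list of Tuple2 in Scala

Time: 2014-03-02 08:37:49

Tags: scala group-by grouping scala-collections

I have a list of Scala Tuple2 values that I need to group. I currently do it the following way.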

var matches:List[Tuple2[String,Int]]
var m = matches.toSeq.groupBy(i=>i._1).map(t=>(t._1,t._2)).toSeq.sortWith(_._2.size>_._2.size).sortWith(_._2.size>_._2.size)

The grouping above gives me a Seq[(String, Seq[(String, Int)])], but I want a Seq[(String, Seq[Int])].

I would like to know whether there is a better and more efficient way to do this.

1 Answer:

Answer 0: (score: 4)

First, some thoughts:

// You should use `val` instead of `var`
var matches: List[Tuple2[String, Int]] = List("a" -> 1, "a" -> 2, "b" -> 3, "c" -> 4, "c" -> 5)
var m = matches
  .toSeq                            // This isn't necessary: it's already a Seq
  .groupBy(i => i._1)
  .map(t => (t._1, t._2))           // This isn't doing anything at all
  .toSeq
  .sortWith(_._2.size > _._2.size)  // `sortBy` will reduce redundancy
  .sortWith(_._2.size > _._2.size)  // Not sure why you have this twice since clearly the 
                                    // second sorting isn't doing anything...

So try this:

val matches: List[Tuple2[String, Int]] = List("a" -> 1, "a" -> 2, "b" -> 3, "c" -> 4, "c" -> 5)
val m: Seq[(String, Seq[Int])] = 
  matches
    .groupBy(_._1)
    .map { case (k, vs) => k -> vs.map(_._2) }  // Drop the String part of the value
    .toVector
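    .sortBy(_._2.size)  // Ascending by group size; use `sortBy(-_._2.size)` for the question's descending order
println(m) // e.g. Vector((b,List(3)), (a,List(1, 2)), (c,List(4, 5))); the order of equal-size groups is not guaranteed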
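
As a possible alternative, the group-then-map step can be written in one call on Scala 2.13 or later, where `groupMap` is part of the standard collections. The sketch below is not from the original answer: the `GroupPairs` object and the `groupValues` helper are hypothetical names, and it assumes you want the largest groups first (as in the question's `sortWith`), with ties broken by key so the result is deterministic.

```scala
// A minimal sketch, assuming Scala 2.13+ (for `groupMap`).
object GroupPairs {
  // Hypothetical helper: collects the Int values under each String key,
  // largest group first; equal-size groups are ordered by key.
  def groupValues(pairs: List[(String, Int)]): Vector[(String, List[Int])] =
    pairs
      .groupMap(_._1)(_._2)                       // Map[String, List[Int]]: group by key, keep only the values
      .toVector
      .sortBy { case (k, vs) => (-vs.size, k) }   // descending by size, then ascending by key

  def main(args: Array[String]): Unit = {
    val matches = List("a" -> 1, "a" -> 2, "b" -> 3, "c" -> 4, "c" -> 5)
    println(groupValues(matches)) // Vector((a,List(1, 2)), (c,List(4, 5)), (b,List(3)))
  }
}
```

On older Scala versions `groupMap` is not available, so the `groupBy` followed by `map` shown above remains the way to go.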