scala - Get a running count of the words in a list while preserving order

Time: 2019-04-30 21:53:18

Tags: scala

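I am trying to get a running count of the words in a list, that is, for each element the number of times that word has occurred so far, while preserving the order of the elements.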

scala> val a = List("she","sells","seashells","by","the","seashore","the", "shells", "she", "sells", "are", "surely", "seashells","where", "are", "the", "shells")
a: List[String] = List(she, sells, seashells, by, the, seashore, the, shells, she, sells, are, surely, seashells, where, are, the, shells)

scala> a.map( x => (x,a.count(_ == x)))
res13: List[(String, Int)] = List((she,2), (sells,2), (seashells,2), (by,1), (the,3), (seashore,1), (the,3), (shells,2), (she,2), (sells,2), (are,2), (surely,1), (seashells,2), (where,1), (are,2), (the,3), (shells,2))

But what I want is

List((she,1), (sells,1), (seashells,1), (by,1), (the,1), (seashore,1), (the,2), (shells,1), (she,2), (sells,2), (are,1), (surely,1), (seashells,2), (where,1), (are,2), (the,3), (shells,2))

I tried something like the following, but it throws an error

scala> a.scanLeft(scala.collection.mutable.Map[String,Int]()){ (x,t) => {x(t) = x(t)+1; (x) } }
java.util.NoSuchElementException: key not found: she
  at scala.collection.MapLike$class.default(MapLike.scala:228)
  at scala.collection.AbstractMap.default(Map.scala:59)
  at scala.collection.mutable.HashMap.apply(HashMap.scala:65)
  at $anonfun$1.apply(<console>:13)
  at $anonfun$1.apply(<console>:13)
  at scala.collection.TraversableLike$$anonfun$scanLeft$1.apply(TraversableLike.scala:374)
  at scala.collection.TraversableLike$$anonfun$scanLeft$1.apply(TraversableLike.scala:374)
  at scala.collection.immutable.List.foreach(List.scala:381)
  at scala.collection.TraversableLike$class.scanLeft(TraversableLike.scala:374)
  at scala.collection.AbstractTraversable.scanLeft(Traversable.scala:104)
  ... 32 elided

2 answers:

Answer 0 (score: 5)

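A somewhat convoluted foldLeft seems to work. It threads two things through the fold: the result list built so far, and a Map of the counts seen so far that defaults to 0 for unseen words.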

a.foldLeft((List.empty[(String,Int)],Map[String,Int]().withDefaultValue(0))){
  case ((lst,cnts),s) => ((s,cnts(s)+1) :: lst, cnts + ((s,cnts(s)+1)))
}._1.reverse
//res0: List[(String, Int)] = List((she,1), (sells,1), (seashells,1), (by,1), (the,1), (seashore,1), (the,2), (shells,1), (she,2), (sells,2), (are,1), (surely,1), (seashells,2), (where,1), (are,2), (the,3), (shells,2))
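For comparison, here is a minimal sketch of how the scanLeft attempt from the question could be made to work. The original threw NoSuchElementException because a plain mutable Map has no default for a missing key; this version swaps in an immutable Map and getOrElse. The names words, seed, and running are illustrative, and the shortened input list is only for the example.

```scala
// Shortened sample input (an assumption for illustration, not the full list from the question).
val words = List("she", "sells", "the", "she", "the", "the")

// State carried through the scan: (counts seen so far, most recent (word, count) pair).
// The ("", 0) pair in the seed is a placeholder that is dropped with .tail below.
val seed = (Map.empty[String, Int], ("", 0))

val running = words.scanLeft(seed) { case ((counts, _), w) =>
  val n = counts.getOrElse(w, 0) + 1   // getOrElse avoids "key not found"
  (counts + (w -> n), (w, n))
}.tail.map(_._2)                        // drop the seed, keep only the pairs

// running: List((she,1), (sells,1), (the,1), (she,2), (the,2), (the,3))
```

Because scanLeft emits every intermediate state in order, no final reverse is needed, at the cost of carrying an extra pair in the state.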

Answer 1 (score: 2)

Here is a version that uses the same principle as the original code:

a.reverse.tails.collect{case s :: t => (s, t.count(_ == s) + 1)}.toList.reverse

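But this is slow and inefficient for long lists, because count rescans the rest of the list for every element (quadratic time overall), so I would go with the previous answer!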

The most efficient solution is to use a mutable Map and a ListBuffer, but that seems like overkill in this case:

def wordCount(words: List[String]): List[(String, Int)] = {
  // Running count per word; unseen words start at 0.
  val wordMap = collection.mutable.Map.empty[String, Int].withDefaultValue(0)
  // Results are appended in input order, so no reverse is needed.
  val res = collection.mutable.ListBuffer.empty[(String, Int)]

  words.foreach { w =>
    wordMap(w) += 1
    res += w -> wordMap(w)
  }

  res.toList
}
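As a rough variation on the same idea (not part of the answer above), the mutable state can be kept local to a single map call, which also makes the helper generic over the element type. The name runningCount is hypothetical.

```scala
import scala.collection.mutable

// Hypothetical helper: running occurrence count for any element type A.
def runningCount[A](xs: List[A]): List[(A, Int)] = {
  val seen = mutable.Map.empty[A, Int]
  // List.map is strict and visits elements in order, so the counts come out in input order.
  xs.map { x =>
    val n = seen.getOrElse(x, 0) + 1
    seen(x) = n
    x -> n
  }
}

val counted = runningCount(List("the", "shells", "the"))
// counted: List((the,1), (shells,1), (the,2))
```

This relies on map being evaluated eagerly and in order, which holds for List but would not be safe on a lazy or parallel collection.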