Question

I have the following data structures:

val set: scala.collection.immutable.Set[String] = ...
val test1: scala.collection.immutable.Map[String,scala.collection.immutable.Set[String]] = ...
val test2: Array[scala.collection.immutable.Set[String]] = ...

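set contains about 60,000 entries. test1 has two entries ("one" and "two"), each of which is a set of strings similar to set. test2 is like test1, but its keys are 0 and 1.

Running test1.get("one").get.contains("somestring") takes a long time (about 1 second), but running test2(0).contains("somestring") is very fast.

I don't quite understand why there is such a big difference. Any ideas?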

Answer 1

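The problem was that I was using mapValues on an existing map to produce the new map. I assumed mapValues worked like map, but mapValues only creates a view over the existing map rather than a new map, so the function passed to it is re-applied on every lookup.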
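To make the difference visible, here is a minimal sketch that counts how often the conversion function runs during lookups. The object name MapValuesDemo, the helper convert, and the sample data are made up for illustration and are not from the question. On Scala 2.13 and later, calling mapValues directly on a Map is deprecated in favour of .view.mapValues, which is lazy in the same way.

```scala
object MapValuesDemo {
  // Returns (conversions triggered by 3 lookups on the mapValues result,
  //          conversions triggered by 3 lookups on the strictly built map).
  def lookupCosts(): (Int, Int) = {
    var conversions = 0
    // Hypothetical stand-in for an expensive per-value transformation.
    def convert(xs: List[String]): Set[String] = { conversions += 1; xs.toSet }

    val source = Map("one" -> List("a", "b"), "two" -> List("c"))

    // mapValues wraps `source`; convert runs again on each access.
    val viewed = source.mapValues(convert)
    val beforeLazy = conversions
    (1 to 3).foreach(_ => viewed("one").contains("a"))
    val lazyCost = conversions - beforeLazy

    // map builds a new Map right away; convert runs once per key, here.
    val strict = source.map { case (k, xs) => k -> convert(xs) }
    val beforeStrict = conversions
    (1 to 3).foreach(_ => strict("one").contains("a"))
    val strictCost = conversions - beforeStrict

    (lazyCost, strictCost)
  }
}
```

With a 60,000-element value, each of those repeated conversions rebuilds the whole set, which would be consistent with the roughly one-second lookups described in the question.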

Answer 2

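This:

test2(0)

runs very fast because test2 is an array, which knows exactly where index 0 is, whereas a map must first find the "one" key before it can return the associated value.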
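As a rough sketch of the two access paths, the snippet below stores the same set behind a map key and an array index. The object name LookupPaths and its contents are hypothetical. It only shows that both paths arrive at the same Set instance, so the contains call itself does the same work either way and only the first step (key lookup versus indexing) differs.

```scala
object LookupPaths {
  // One shared set, reachable through a map key and through an array index.
  val shared: Set[String] = (1 to 1000).map(_.toString).toSet
  val byKey: Map[String, Set[String]] = Map("one" -> shared, "two" -> Set("x"))
  val byIndex: Array[Set[String]] = Array(shared, Set("x"))

  // True when both access paths return the very same object.
  def sameTarget: Boolean = byKey("one") eq byIndex(0)

  // Membership test performed through both paths.
  def bothFind(s: String): Boolean =
    byKey("one").contains(s) && byIndex(0).contains(s)
}
```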

Answer 3

Run this code to generate collections like the ones you mentioned:

import scala.collection.mutable.{HashSet, HashMap}
import scala.util.Random

def genSet(count: Int = 100 * 1000, stringSize: Int = 10): Set[String] = {
  val set = new HashSet[String]
  set.sizeHint(count)

  for(i <- 1 to count) {
    set.add(i.toString)
  }

  set.toSet
} 

def genSetMap(count: Int = 2, keySize: Int = 10)
             (f: => Set[String]): Map[String, Set[String]] = {
  val map = new HashMap[String, Set[String]]
  map.sizeHint(count)

  for(i <- 1 to count) {
    map.put(i.toString, f) // evaluate the by-name argument once per key
  }

  map.toMap
}

The following test, with 100,000 elements per set, still runs instantly:

val map = genSetMap(2, 10){ genSet(100*1000) }
map("2").contains("99999") // res2: Boolean = true

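So I suspect that some peculiarity of your actual code makes it produce not a Set but some other intermediate collection without fast lookup. Could you provide a more concrete code example?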
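One way to narrow this down is to time each access path separately. The helper below is a minimal sketch (Timing and timed are names invented here, not part of any library). A single measurement on the JVM is noisy because of JIT warm-up, so treat the numbers as a rough indication only.

```scala
object Timing {
  // Runs `body` once and returns its result together with the elapsed nanoseconds.
  def timed[A](body: => A): (A, Long) = {
    val start = System.nanoTime()
    val result = body
    (result, System.nanoTime() - start)
  }
}
```

For example, Timing.timed(test1("one")) and Timing.timed(test1("one").contains("somestring")) could be compared, and test1("one").getClass would show which concrete collection is actually being searched.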

Why does a lookup through a Map take so long in this particular code?
