unordered_set::find operation takes linear time

Time: 2019-12-22 14:09:47

Tags: c++ time-complexity unordered-set

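I'm new to C++ and assumed that unordered_set is implemented as a hash table and therefore offers O(1) constant-time lookups, but in the problem I'm working on, the find operation seems to take linear time.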

bool isWordInSet(string word, unordered_set<string> set) {
    return set.find(word) != set.end();
}

int main() {
...
unordered_set<string> wordSets[fileCount]; // pre-filled array of unordered_set<string>
...

for (int i = 0; i < fileCount; i++) {
    unordered_set<string>::const_iterator it = wordSets[i].begin();
    while (it != wordSets[i].end()) {
        haystack.insert(*it);
        // haystack.insert(*it + "ss"); // added in order to double set's size
        // haystack.insert(*it + "ss2"); // added in order to triple set's size
        it++;
    }
}

int common = 0;
double t = omp_get_wtime();
unordered_set<string>::const_iterator it = needle.begin();
// traverse needle set to find items that are common with haystack
while (it != needle.end()) {
    // if (haystack.find(*it) != haystack.end()) -> this takes O(1), but below is linear
    if (isWordInSet(*it, haystack)) // takes proportional time to haystack's size
        common++;
    it++;
}
t = omp_get_wtime() - t;
cout << "SizeHay: " << haystack.size() << " Time: " << t * 10'000 << "\n";
}

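When I uncomment the lines haystack.insert(*it + "ss"); and haystack.insert(*it + "ss2");, the time needed to traverse the needle set and look up its items grows proportionally. Is this expected behavior, or is there something wrong with the way I call find?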

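Edit: It turns out the time is proportional to haystack's size only when I call the isWordInSet function; when I inline it as haystack.find(*it) != haystack.end(), it is not. This seems really strange to me. I would expect the function to be called only needle.size() times, regardless of haystack.size().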
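
A likely explanation, offered as a sketch rather than a confirmed diagnosis: isWordInSet declares its set parameter by value, so every call copy-constructs the whole haystack before find runs, and copying an unordered_set is linear in its size. The self-contained example below counts element copies per lookup. CountedWord, CountedWordHash, g_copies, and copiesPerLookup are hypothetical names introduced only for illustration and do not appear in the original code. inSetByValue mirrors the signature from the question, and inSetByRef is an assumed fix that takes the set by const reference.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>
#include <utility>

// Hypothetical global counter: how many times an element was copy-constructed.
static std::size_t g_copies = 0;

// Hypothetical key type that records every copy of itself.
struct CountedWord {
    std::string text;
    explicit CountedWord(std::string s) : text(std::move(s)) {}
    CountedWord(const CountedWord& other) : text(other.text) { ++g_copies; }
    CountedWord(CountedWord&& other) noexcept : text(std::move(other.text)) {}
    bool operator==(const CountedWord& other) const { return text == other.text; }
};

struct CountedWordHash {
    std::size_t operator()(const CountedWord& w) const {
        return std::hash<std::string>()(w.text);
    }
};

typedef std::unordered_set<CountedWord, CountedWordHash> WordSet;

// Same shape as isWordInSet in the question: the set is passed by value,
// so the caller's set is copied element by element on every call.
bool inSetByValue(const CountedWord& word, WordSet set) {
    return set.find(word) != set.end();
}

// Assumed fix: pass by const reference, so no copy is made.
bool inSetByRef(const CountedWord& word, const WordSet& set) {
    return set.find(word) != set.end();
}

// Builds a set of n words, performs one lookup, and returns how many
// element copies that single lookup caused.
std::size_t copiesPerLookup(bool byValue, std::size_t n) {
    WordSet haystack;
    for (std::size_t i = 0; i < n; ++i) {
        haystack.emplace("w" + std::to_string(i));
    }
    const CountedWord needle("w0");
    g_copies = 0;  // ignore anything that happened while filling the set
    const bool found = byValue ? inSetByValue(needle, haystack)
                               : inSetByRef(needle, haystack);
    assert(found || n == 0);
    (void)found;
    return g_copies;
}
```

Under these assumptions, copiesPerLookup(true, 1000) reports at least 1000 element copies for a single lookup, while copiesPerLookup(false, 1000) reports none. That would be consistent with the time growing with haystack.size() when isWordInSet is called and staying flat when the find call is inlined.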

0 answers:

No answers