B-Tree vs. Trie: comparing search speed

Asked: 2017-04-09 16:40:12

Tags: algorithm search data-structures trie b-tree

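I am trying to find out which is more efficient for searching, a trie or a B-Tree. I have a dictionary of English words and I want to look up a word in that dictionary efficiently.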

2 answers:

Answer 0 (score: 1)

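If by "more efficient search time" you mean theoretical time complexity, then a B-tree offers O(log n * |S|) (1) search time, while a trie offers O(|S|), where |S| is the length of the searched string and n is the number of elements in the dictionary.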
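To make the O(|S|) bound concrete, here is a minimal trie sketch. The names `TrieNode`, `Trie`, `contains`, and the `steps` counter are illustrative assumptions, not part of the original answer. The point is only that a lookup follows at most one child pointer per character of the searched string, no matter how many words are stored.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # maps a character to the child node
        self.is_word = False  # True if a dictionary word ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def contains(self, word):
        """Return (found, steps); steps counts the child pointers followed."""
        node = self.root
        steps = 0
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False, steps
            steps += 1
        return node.is_word, steps


trie = Trie()
for w in ["car", "card", "care", "cat", "dog"]:
    trie.insert(w)

print(trie.contains("card"))  # (True, 4)
print(trie.contains("ca"))    # (False, 2): a prefix, but not a stored word
print(trie.contains("cow"))   # (False, 1): stops at the first missing child
```

The number of steps never exceeds len(word), which is the |S| in the bound above.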

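If by "more efficient search time" you mean actual wall-clock running time, it depends on the actual implementation, the actual data, and the actual search behavior. Some examples of factors that can affect the answer: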

  • Data size
  • Storage system (for example: RAM / Flash / disk / distributed filesystem / ...)
  • Distribution of the searches
  • Code optimizations in each implementation
  • (and more)

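(1) A lookup performs O(log n) comparisons, and each comparison takes O(|S|) time, because you may need to traverse the whole string to decide which one is greater (worst-case analysis).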
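The comparison count in the footnote can be illustrated with a plain binary search over a sorted list. This is a stand-in, not a B-tree: a real B-tree searches inside each node on its way down, but the total number of key comparisons is still O(log n). The function name `binary_search` and the generated word list are assumptions made for the example.

```python
def binary_search(sorted_words, target):
    """Return (found, comparisons) for a lookup in a sorted list of strings."""
    lo, hi = 0, len(sorted_words)
    comparisons = 0
    while lo < hi:
        mid = (lo + hi) // 2
        comparisons += 1  # one string comparison, O(|S|) in the worst case
        if sorted_words[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    # One final equality check, not counted in the loop total above.
    found = lo < len(sorted_words) and sorted_words[lo] == target
    return found, comparisons


# Hypothetical dictionary of 1000 keys that share a long common prefix,
# so each comparison has to scan several characters before deciding.
words = sorted(f"word{i:04d}" for i in range(1000))

found, comparisons = binary_search(words, "word0733")
print(found, comparisons)  # found is True; comparisons is about log2(1000)
```

With 1000 keys the loop runs about log2(1000) times, roughly 10, and each of those comparisons may scan up to |S| characters, which gives the O(log n * |S|) product.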

Answer 1 (score: 0)

It depends on what you need. If you want to retrieve a whole subtree, a B+ tree is your best choice because it is space efficient. The branching factor b of the B+ tree also affects its performance, since it determines the number of intermediate nodes. If h is the height of the tree, then n_max ≈ b^h, and therefore h ≈ log(n_max) / log(b).

With n = 1 000 000 000 and b = 100, we have h ≈ 5 (log(10^9) / log(100) = 4.5, rounded up). This means only about 5 pointer dereferences to go from the root to a leaf, which is more cache-friendly than a trie.
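The height estimate can be checked with a few lines of integer arithmetic. The helper name `min_height` is hypothetical. It assumes an idealized tree in which every node has exactly b children, which is a simplification of how real B+ trees fill their nodes.

```python
def min_height(n, b):
    """Smallest h such that b**h >= n, using integer arithmetic only."""
    h = 0
    capacity = 1  # number of keys reachable with h levels of fan-out b
    while capacity < n:
        capacity *= b
        h += 1
    return h


print(min_height(1_000_000_000, 100))  # 5
print(min_height(1_000_000_000, 2))    # 30, for comparison with a binary tree
```

A larger branching factor trades more work inside each node for fewer levels, which is why a fan-out of 100 needs only 5 levels where a binary tree needs 30.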

But if you want to get the first N children of a subtree, a trie is the best choice because you simply visit fewer nodes than in the B+ tree scenario. Word prefix completion is also handled well by a trie.
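Prefix completion with a result limit might look like the following sketch. It uses a nested-dict trie rather than a node class, and the names `END`, `build_trie`, and `complete` are assumptions made for the example. The traversal walks down to the node for the prefix, then does a depth-first walk in character order and stops as soon as `limit` words have been collected.

```python
END = ""  # sentinel key marking the end of a word; never a real character


def build_trie(words):
    """Build a nested-dict trie from an iterable of words."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def complete(root, prefix, limit):
    """Return up to `limit` words starting with `prefix`, in sorted order."""
    node = root
    for ch in prefix:
        node = node.get(ch)
        if node is None:
            return []
    results = []
    stack = [(node, prefix)]
    while stack and len(results) < limit:
        current, text = stack.pop()
        if END in current:
            results.append(text)
        # Push children in reverse order so the smallest is popped first.
        for ch in sorted((k for k in current if k != END), reverse=True):
            stack.append((current[ch], text + ch))
    return results


root = build_trie(["car", "card", "care", "cat", "dog", "do"])
print(complete(root, "ca", 3))  # ['car', 'card', 'care']
print(complete(root, "do", 5))  # ['do', 'dog']
print(complete(root, "x", 5))   # []
```

Only nodes under the prefix are touched, and the walk ends early once the limit is reached, which matches the "visit fewer nodes" argument above.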