
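Search speed comparison of B-Tree and Trie

Time: 2017-04-09 16:40:12

Tags: algorithm search data-structures trie b-tree

I am trying to find out which is more efficient for search speed, a trie or a B-Tree. I have a dictionary of English words, and I want to look up a word in that dictionary efficiently.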

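2 answers:

Answer 0: (score: 1)

If by "more efficient search time" you mean theoretical time complexity, then a B-Tree offers O(log n * |S|) (1) time complexity for a search, while a trie offers O(|S|) time complexity, where |S| is the length of the searched string and n is the number of elements in the dictionary.

If by "more efficient search time" you mean actual running time in practice, that depends on the actual implementation, the actual data, and the actual search behavior. Some examples of factors that can affect the answer:

Data size
Storage system (e.g. RAM / Flash / disk / distributed filesystem / ...)
Distribution of searches
Code optimizations in each implementation
(and many more)

(1) A search performs O(log n) comparisons, and each comparison takes O(|S|) time, because you need to traverse the whole string to decide which one is greater (worst-case analysis).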
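
To make the two bounds concrete, here is a minimal Python sketch that counts the work done by each approach. It is an illustration under assumptions, not a B-Tree implementation: a binary search over a sorted list stands in for the B-Tree's O(log n) string comparisons, and the names `build_trie`, `trie_search`, `sorted_search`, and the `END` marker are made up for this example.

```python
END = "$"  # hypothetical end-of-word marker; assumes words never contain "$"


def build_trie(words):
    """Build a trie as nested dicts, one dict per node."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def trie_search(root, word):
    """Return (found, node_steps). Takes at most len(word) steps: O(|S|)."""
    node = root
    steps = 0
    for ch in word:
        if ch not in node:
            return False, steps
        node = node[ch]
        steps += 1
    return END in node, steps


def sorted_search(sorted_words, word):
    """Binary search stand-in for a B-Tree lookup.

    Return (found, string_comparisons). There are O(log n) loop iterations,
    and each string comparison can cost O(|S|) in the worst case.
    """
    lo, hi = 0, len(sorted_words)
    comparisons = 0
    while lo < hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if sorted_words[mid] == word:
            return True, comparisons
        if sorted_words[mid] < word:
            lo = mid + 1
        else:
            hi = mid
    return False, comparisons


words = sorted(["apple", "banana", "cherry", "date", "elder", "fig", "grape"])
trie = build_trie(words)
print(trie_search(trie, "cherry"))     # 6 node steps, one per character
print(sorted_search(words, "cherry"))  # a few whole-string comparisons
```

The trie's step count depends only on the length of the query, while the number of string comparisons in the sorted structure grows with log n. Which one is faster in wall-clock time still depends on the factors listed above.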

Answer 1: (score: 0)

It depends on what you need. If you want to retrieve a whole subtree, a B+ Tree is your best choice because it is space-efficient. The branching factor b of the B+ Tree also affects its performance, since it determines the number of intermediate nodes. If h is the height of the tree, then n_max ≈ b^h, and therefore h ≈ log(n_max) / log(b).

With n = 1 000 000 000 and b = 100, we have h ≈ log(10^9) / log(100) = 4.5, so about 5 levels. That means only about 5 pointer dereferences to go from the root to a leaf. It's more cache-friendly than a trie.
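
The height estimate can be checked with a few lines of Python. This is a small sketch under the same idealized assumption as the formula (every node has exactly b children); the function name `levels_needed` is made up for this example, and it uses integer arithmetic to avoid floating-point rounding.

```python
def levels_needed(n_max: int, b: int) -> int:
    """Smallest h such that b**h >= n_max, assuming a full tree of fan-out b."""
    if b < 2:
        raise ValueError("branching factor must be at least 2")
    h = 0
    capacity = 1  # number of keys reachable with h levels
    while capacity < n_max:
        capacity *= b
        h += 1
    return h


print(levels_needed(1_000_000_000, 100))  # 5
print(levels_needed(1_000_000_000, 2))    # 30, a binary tree is much deeper
```

A larger branching factor shrinks the height quickly, which is why the lookup touches so few nodes.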

But if you want to get the first N children of a subtree, then a trie is the best choice because you simply visit fewer nodes than in the B+ Tree scenario. Word prefix completion is also handled well by a trie.
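
As an illustration of prefix completion, here is a minimal trie sketch in Python that returns the first N words under a prefix. The class and method names (`Trie`, `complete`, and so on) are hypothetical, and the code assumes results should come back in alphabetical order; it is a sketch, not a tuned implementation.

```python
from itertools import islice


class TrieNode:
    def __init__(self):
        self.children = {}    # maps a character to the child TrieNode
        self.is_word = False  # True if a word ends at this node


class Trie:
    def __init__(self, words=()):
        self.root = TrieNode()
        for word in words:
            self.insert(word)

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def _find(self, prefix):
        """Return the node reached by following prefix, or None."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def _walk(self, node, prefix):
        """Lazily yield every word below node in alphabetical order."""
        if node.is_word:
            yield prefix
        for ch in sorted(node.children):
            yield from self._walk(node.children[ch], prefix + ch)

    def complete(self, prefix, limit):
        """Return up to limit words that start with prefix."""
        node = self._find(prefix)
        if node is None:
            return []
        # islice stops the lazy walk early, so only the nodes needed
        # to produce the first `limit` words are visited.
        return list(islice(self._walk(node, prefix), limit))


trie = Trie(["car", "card", "care", "cat", "dog"])
print(trie.complete("car", 2))  # ['car', 'card']
```

Because the walk is a generator, asking for the first N completions does not traverse the rest of the subtree.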