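Suppose you do not know the number of elements to search, and you are given an API that accepts an index and returns null when the index is out of bounds (implemented here as the getWordFromDictionary method). How do you perform a binary search and implement the isWordInDictionary() method for a client program?
This solution works, but I ended up doing a serial search above the level where I found the initial high index value. Searching through the lower range of values was inspired by this answer. I also peeked at BinarySearch in Reflector (a C# decompiler), but it works with a known list length, so I am still looking to fill in the gaps.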
    private static string[] dictionary;

    static void Main(string[] args)
    {
        dictionary = System.IO.File.ReadAllLines(@"C:\tmp\dictionary.txt");
        Console.WriteLine(isWordInDictionary("aardvark", 0));
        Console.WriteLine(isWordInDictionary("bee", 0));
        Console.WriteLine(isWordInDictionary("zebra", 0));
        Console.WriteLine(isWordInDictionaryBinary("aardvark"));
        Console.WriteLine(isWordInDictionaryBinary("bee"));
        Console.WriteLine(isWordInDictionaryBinary("zebra"));
        Console.ReadLine();
    }

    static bool isWordInDictionaryBinary(string word)
    {
        // assume the size of the dictionary is unknown
        // quick check for empty dictionary
        string w = getWordFromDictionary(0);
        if (w == null)
            return false;
        // assume that the length is very big.
        int low = 0;
        int hi = int.MaxValue;
        while (low <= hi)
        {
            int mid = (low + ((hi - low) >> 1));
            w = getWordFromDictionary(mid);
            // If the middle element m you select at each step is outside
            // the array bounds (you need a way to tell this), then limit
            // the search to those elements with indexes smaller than m.
            if (w == null)
            {
                hi = mid;
                continue;
            }
            int compare = String.Compare(w, word);
            if (compare == 0)
                return true;
            if (compare < 0)
                low = mid + 1;
            else
                hi = mid - 1;
        }
        // punting on the search above the current value of hi
        // to the (still unknown) upper limit
        return isWordInDictionary(word, hi);
    }

    // serial search, works well for a small number of items
    static bool isWordInDictionary(string word, int startIndex)
    {
        // assume the size of the dictionary is unknown
        int i = startIndex;
        while (getWordFromDictionary(i) != null)
        {
            if (getWordFromDictionary(i).Equals(word, StringComparison.OrdinalIgnoreCase))
                return true;
            i++;
        }
        return false;
    }

    private static string getWordFromDictionary(int index)
    {
        try
        {
            return dictionary[index];
        }
        catch (IndexOutOfRangeException)
        {
            return null;
        }
    }
Final code after the answers
    static bool isWordInDictionaryBinary(string word)
    {
        // assume the size of the dictionary is unknown
        // quick check for empty dictionary
        string w = getWordFromDictionary(0);
        if (w == null)
            return false;
        // assume that the number of elements is very big
        int low = 0;
        int hi = int.MaxValue;
        while (low <= hi)
        {
            int mid = (low + ((hi - low) >> 1));
            w = getWordFromDictionary(mid);
            // treat null the same as finding a string that comes
            // after the string you are looking for
            if (w == null)
            {
                hi = mid - 1;
                continue;
            }
            int compare = String.Compare(w, word);
            if (compare == 0)
                return true;
            if (compare < 0)
                low = mid + 1;
            else
                hi = mid - 1;
        }
        return false;
    }
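Answer 0 (score: 4)
You can implement the binary search in two phases. In the first phase, you keep growing the size of the interval you are searching. Once you detect that you have gone out of bounds, you can do a normal binary search in the last interval you found. Something like this: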
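    bool isPresentPhase1(string word)
    {
        int l = 0, d = 1;
        while (true) // you should eventually reach an index out of bounds
        {
            string w = getWord(l + d);
            if (w == null)
                return isPresentPhase2(word, l, l + d - 1);
            int c = String.Compare(w, word);
            if (c == 0)
                return true;
            else if (c > 0)
                // w sorts after word, so word can only be in [l, l + d - 1]
                return isPresentPhase2(word, l, l + d - 1);
            else
            {
                // w sorts before word: skip past it and double the step
                l += d + 1;
                d *= 2;
            }
        }
    }

    bool isPresentPhase2(string word, int lo, int hi)
    {
        // normal binary search in the interval [lo, hi];
        // an out-of-bounds probe (null) counts as coming after word
        while (lo <= hi)
        {
            int mid = lo + ((hi - lo) >> 1);
            string w = getWord(mid);
            int c = (w == null) ? 1 : String.Compare(w, word);
            if (c == 0)
                return true;
            if (c < 0)
                lo = mid + 1;
            else
                hi = mid - 1;
        }
        return false;
    }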
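Answer 1 (score: 2)
Sure you can. Start at index 1 and keep doubling the query index until you hit a word that is lexicographically greater than the query word (Edit: or null). Then you can narrow the search space again until you find the index, or return false.
Edit: note that this does not add to the asymptotic running time; it is still O(log N), where N is the number of items in the series.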
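The doubling idea above might look roughly like the following. This is only a sketch under some assumptions: the class name `UnknownLengthSearch`, the method `Contains`, the `getWord` delegate and the `probes` counter are all made up for illustration, and the sequence is assumed to be sorted in ordinal order (to match `string.CompareOrdinal`) and to hold at most 2^30 entries so that doubling the index cannot overflow.

```csharp
using System;

static class UnknownLengthSearch
{
    // Hypothetical helper (not from the original post). Searches a sorted
    // sequence of unknown length through a lookup delegate that returns
    // null for any index past the end. 'probes' counts the lookups made.
    // Assumes ordinal sort order and at most 2^30 entries.
    public static bool Contains(Func<int, string> getWord, string word, out int probes)
    {
        probes = 0;

        // Phase 1: double the probe index until we hit a larger word
        // or run off the end of the sequence (null).
        int lo = 0;
        int hi = 1;
        while (true)
        {
            probes++;
            string w = getWord(hi);
            if (w == null)
                break;
            int c = string.CompareOrdinal(w, word);
            if (c == 0)
                return true;
            if (c > 0)
                break;
            lo = hi + 1;   // everything up to hi sorts before word
            hi *= 2;
        }

        // Phase 2: ordinary binary search in [lo, hi - 1]. Index hi itself
        // is already ruled out; null is treated as sorting after word.
        hi--;
        while (lo <= hi)
        {
            int mid = lo + ((hi - lo) >> 1);
            probes++;
            string w = getWord(mid);
            int c = (w == null) ? 1 : string.CompareOrdinal(w, word);
            if (c == 0)
                return true;
            if (c < 0)
                lo = mid + 1;
            else
                hi = mid - 1;
        }
        return false;
    }
}
```

Calling `UnknownLengthSearch.Contains(getWordFromDictionary, "zebra", out int probes)` would plug in the lookup method from the question. Both phases halve or double an index on each step, so the number of lookups stays on the order of 2·log2(N), which is consistent with the O(log N) claim.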
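Answer 2 (score: 0)
So, I am not sure I fully understand the problem from your description, but I assume you are trying to search a sorted array of unknown length for a particular string. I also assume that there are no nulls in the actual array; the array only returns null if you ask for an index that is out of bounds.
If both of those are true, then the solution should just be a standard binary search, except that you search across the whole integer space and simply treat null as finding a string that comes after the string you are looking for. Basically, just imagine that your sorted array of N strings is actually a sorted array of INT_MAX strings, with nulls sorted at the end.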
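One way to make the "nulls sorted at the end" picture concrete is to put that rule into a small comparison helper. The following is a minimal sketch, not the asker's code: `NullLastSearch`, `CompareProbe`, `Contains` and the `getWord` delegate are hypothetical names, and the data is assumed to be sorted in ordinal order.

```csharp
using System;

static class NullLastSearch
{
    // Hypothetical helper: compares a probed entry against the target,
    // ranking null (an out-of-bounds probe) after every real string.
    static int CompareProbe(string probed, string target)
    {
        return probed == null ? 1 : string.CompareOrdinal(probed, target);
    }

    // Plain binary search over the whole non-negative int range, as if the
    // sequence were padded with nulls up to index int.MaxValue.
    public static bool Contains(Func<int, string> getWord, string word)
    {
        int lo = 0;
        int hi = int.MaxValue;
        while (lo <= hi)
        {
            int mid = lo + ((hi - lo) >> 1);
            int c = CompareProbe(getWord(mid), word);
            if (c == 0)
                return true;
            if (c < 0)
                lo = mid + 1;
            else
                hi = mid - 1;
        }
        return false;
    }
}
```

Because the search range always starts at the full int range, a lookup that fails costs around 31 probes no matter how short the real sequence is, and most of those probes land out of bounds when the sequence is small.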
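What I do not quite understand is that you seem to have basically done this already (at least from a cursory look at the code), so I think I may be misunderstanding your question entirely.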