Binary search for the closest value less than or equal to the search value

Time: 2015-03-22 16:17:47

Tags: algorithm search binary-search

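I'm trying to write an algorithm that finds the index of the closest value that is less than or equal to the search value in a sorted array. For the example array [10, 20, 30], the following search values should output these indexes: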

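  1. searchValue: 9, index: -1
  2. searchValue: 10, index: 0
  3. searchValue: 28, index: 1
  4. searchValue: 55555, index: 2

I want to use binary search to get logarithmic running time. I have an algorithm in C-esque pseudocode, but it has 3 base cases. Can these 3 base cases be condensed into 1 for a more elegant solution?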

    int function indexOfClosestLesser(array, searchValue, startIndex, endIndex) {
      if (startIndex == endIndex) {
        if (searchValue >= array[startIndex]) {
          return startIndex;
        } else {
          return -1;
        }
      }
    
      // In the simplistic case of searching for 2 in [0, 2], the midIndex
      // is always 0 due to int truncation. These checks are to avoid recursing
      // infinitely from index 0 to index 1. 
      if (startIndex == endIndex - 1) {
        if (searchValue >= array[endIndex]) {
          return endIndex;
        } else if (searchValue >= array[startIndex]) {
          return startIndex;
        } else {
          return -1;
        }
      }
    
      // In normal binary search, this would be the only base case
      if (startIndex > endIndex) {
        return -1;
      }
    
      int midIndex = endIndex / 2 + startIndex / 2;
      int midValue = array[midIndex];
    
      if (midValue > searchValue) {
        return indexOfClosestLesser(array, searchValue, startIndex, midIndex - 1);
      } else if (searchValue >= midValue) {
        // Unlike normal binary search, we don't start on midIndex + 1.
        // We're not sure whether the midValue can be excluded yet
        return indexOfClosestLesser(array, searchValue, midIndex, endIndex);
      }
    }
    

7 answers:

Answer 0 (score: 6)

Based on your recursive approach, I would suggest the following C++ snippet, which reduces the number of distinct cases:

int search(int *array, int start_idx, int end_idx, int search_val) {

   if( start_idx == end_idx )
      return array[start_idx] <= search_val ? start_idx : -1;

   int mid_idx = start_idx + (end_idx - start_idx) / 2;

   if( search_val < array[mid_idx] )
      return search( array, start_idx, mid_idx, search_val );

   int ret = search( array, mid_idx+1, end_idx, search_val );
   return ret == -1 ? mid_idx : ret;
}

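Basically, it performs a normal binary search. It differs only in the return statement of the last case, which fulfills the additional requirement.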

Here is a short test program:

#include <iostream>

int main( int argc, char **argv ) {

   int array[3] = { 10, 20, 30 };

   std::cout << search( array, 0, 2, 9 ) << std::endl;
   std::cout << search( array, 0, 2, 10 ) << std::endl;
   std::cout << search( array, 0, 2, 28 ) << std::endl;
   std::cout << search( array, 0, 2, 55555 ) << std::endl;

   return 0;
}

The output is as desired:

-1
0
1
2
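For a somewhat broader check than the four values above, the recursive function can be compared against a plain linear scan. This is only a minimal sketch: `linear_floor_index` and `agrees_with_linear_scan` are hypothetical helper names that do not appear in the answer, the array is assumed to be sorted ascending and non-empty, and `search` is repeated (with a `const` pointer) so the block compiles on its own.

```cpp
#include <vector>

// Recursive search from the answer above, repeated so this block is self-contained.
int search(const int *array, int start_idx, int end_idx, int search_val) {
   if (start_idx == end_idx)
      return array[start_idx] <= search_val ? start_idx : -1;

   int mid_idx = start_idx + (end_idx - start_idx) / 2;

   if (search_val < array[mid_idx])
      return search(array, start_idx, mid_idx, search_val);

   int ret = search(array, mid_idx + 1, end_idx, search_val);
   return ret == -1 ? mid_idx : ret;
}

// Hypothetical reference: O(n) scan for the last index whose value is <= search_val.
int linear_floor_index(const std::vector<int> &values, int search_val) {
   int best = -1;
   for (int i = 0; i < static_cast<int>(values.size()); ++i)
      if (values[i] <= search_val)
         best = i;
   return best;
}

// Hypothetical check: true if both functions agree for every probe in [lo, hi].
// Assumes `values` is sorted ascending and non-empty.
bool agrees_with_linear_scan(const std::vector<int> &values, int lo, int hi) {
   for (int probe = lo; probe <= hi; ++probe) {
      int expected = linear_floor_index(values, probe);
      int actual = search(values.data(), 0, static_cast<int>(values.size()) - 1, probe);
      if (expected != actual)
         return false;
   }
   return true;
}
```

Calling `agrees_with_linear_scan` on an array with repeated values (for example 10, 10, 20) also exercises the duplicate case, where the last matching index is expected.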

Answer 1 (score: 0)

Here is a PHP version, based on user0815's answer.

I adapted it to work with a function rather than just an array, and made it more efficient by avoiding evaluating $mid_idx twice.

function binarySearchLessOrEqual($start_idx, $end_idx, $search_val, $valueFunction)
{
    //N.B. If the start index has passed the end index, the range is empty and
    //nothing qualifies. Returning here also avoids evaluating an index outside the range.
    if( $start_idx > $end_idx )
    {
        return -1;
    }

    $mid_idx = intval($start_idx + ($end_idx - $start_idx) / 2);

    if ( $valueFunction($mid_idx) > $search_val )  //If the function is too big, we search in the bottom half
    {
        return binarySearchLessOrEqual( $start_idx, $mid_idx-1, $search_val, $valueFunction);
    }
    else //If the function returns less than OR equal, we search in the top half
    {
        $ret = binarySearchLessOrEqual($mid_idx+1, $end_idx, $search_val, $valueFunction);

        //If nothing is suitable, then $mid_idx was actually the best one!
        return $ret == -1 ? $mid_idx : $ret;
    }
}

Instead of taking an array, it takes an int-indexed function. You can easily adapt it to take an array instead, or just use it as shown below:

function indexOfClosestLesser($array, $searchValue) 
{
    return binarySearchLessOrEqual(
        0, 
        count($array)-1, 
        $searchValue,
        function ($n) use ($array) 
        { 
            return $array[$n]; 
        }
    );
}

Test:

$array = [ 10, 20, 30 ];
echo "0:  " . indexOfClosestLesser($array, 0)  . "<br>"; //-1
echo "5:  " . indexOfClosestLesser($array, 5)  . "<br>"; //-1
echo "10: " . indexOfClosestLesser($array, 10) . "<br>"; //0
echo "15: " . indexOfClosestLesser($array, 15) . "<br>"; //0
echo "20: " . indexOfClosestLesser($array, 20) . "<br>"; //1
echo "25: " . indexOfClosestLesser($array, 25) . "<br>"; //1
echo "30: " . indexOfClosestLesser($array, 30) . "<br>"; //2
echo "35: " . indexOfClosestLesser($array, 35) . "<br>"; //2
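The idea of searching over an int-indexed function rather than an array can be sketched in C++ as well. This is an illustrative, iterative counterpart and not a translation of the PHP code: the name `search_less_or_equal` is hypothetical, `value_at` is assumed to be non-decreasing over the index range, and indexes are assumed to be non-negative so that -1 can serve as "not found".

```cpp
#include <functional>

// Hypothetical C++ counterpart: returns the largest index in [start_idx, end_idx]
// whose value is <= search_val, or -1 if there is none.
// Assumes value_at(i) is non-decreasing over the range and indexes are >= 0.
long long search_less_or_equal(long long start_idx, long long end_idx, long long search_val,
                               const std::function<long long(long long)> &value_at) {
    long long best = -1;
    while (start_idx <= end_idx) {
        long long mid_idx = start_idx + (end_idx - start_idx) / 2;
        if (value_at(mid_idx) > search_val) {
            end_idx = mid_idx - 1;    // too big: continue in the bottom half
        } else {
            best = mid_idx;           // candidate: remember it, then try the top half
            start_idx = mid_idx + 1;
        }
    }
    return best;
}
```

Because the values come from a callable, the same routine works without any array, for example to find the largest i with i * i <= 50 by passing a lambda that returns `i * i`.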

Answer 2 (score: 0)

Try using a pair of global variables, then reference those variables in the COMPARE function for bsearch.

In RPGIV we can call C functions.

The compare function with the global variables looks like this:

dcl-proc compInvHdr;
  dcl-pi compInvHdr int(10);
    elmPtr1   pointer value;
    elmPtr2   pointer value;
  end-pi;
  dcl-ds elm1            based(elmPtr1) likeds(invHdr_t);
  dcl-ds elm2            based(elmPtr2) likeds(elm1);
  dcl-s  low             int(10) inz(-1);
  dcl-s  high            int(10) inz(1);
  dcl-s  equal           int(10) inz(0);

  select;
  when elm1.rcd.RECORDNO < elm2.rcd.RECORDNO;
    lastHiPtr = elmPtr2;
    return low;
  when elm1.rcd.RECORDNO > elm2.rcd.RECORDNO;
    lastLoPtr = elmPtr2;
    return high;
  other;
    return equal;
  endsl;
end-proc;

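Remember that in bsearch the first element is the search key and the second element is the actual stored element in your array/memory, which is why the COMPARE procedure uses elmPtr2.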

The call to bsearch looks like this:

// lastLoPtr and LastHiPtr are global variables
// basePtr points to the beginning of the array
lastLoPtr = basePtr;  
lastHiPtr = basePtr + ((numRec - 1) * sizRec);
searchKey = 'somevalue'; 
hitPtr = bsearch(%addr(searchkey)
                :basePtr
                :numRec
                :sizRec
                :%PADDR('COMPINVHDR'));
if hitPtr = *null;
//? not found
  hitPtr = lastLoPtr;
else;
//? found
endif;

So if the key is not found, hitPtr is set to the closest matching key, effectively achieving "less than or equal to the key".

If you want the opposite, that is, the next greater key, use lastHiPtr to reference the first key greater than the search key.

Note: protect the global variables against race conditions (if applicable).
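The same trick can be sketched with `std::bsearch` in C++. This is a rough illustration, not a port of the RPG code: `last_lower`, `compare_and_track` and `find_less_or_equal` are hypothetical names, the elements are plain `int`s, and the approach assumes the library probes the array like an ordinary binary search (common implementations do, but the comparator call pattern is not something to rely on blindly). A `thread_local` variable stands in for the global to address the race-condition note, and it starts as `nullptr` so that "no smaller element" can be told apart from the first element.

```cpp
#include <cstdlib>

namespace {
// Hypothetical state shared with the comparator. thread_local avoids races
// between threads; the function is still not re-entrant.
thread_local const int *last_lower = nullptr;

// std::bsearch passes the key first and the array element second.
int compare_and_track(const void *key_ptr, const void *elem_ptr) {
    const int key = *static_cast<const int *>(key_ptr);
    const int *elem = static_cast<const int *>(elem_ptr);
    if (key < *elem) return -1;
    if (key > *elem) {
        last_lower = elem;   // element is below the key: best candidate so far
        return 1;
    }
    return 0;
}
}  // namespace

// Returns a pointer to an element equal to key, else to the closest smaller
// element, else nullptr. Assumes `base` is sorted ascending.
const int *find_less_or_equal(const int *base, std::size_t count, int key) {
    last_lower = nullptr;    // nullptr (not base) so "nothing smaller" is detectable
    const void *hit = std::bsearch(&key, base, count, sizeof(int), compare_and_track);
    return hit != nullptr ? static_cast<const int *>(hit) : last_lower;
}
```

Subtracting `base` from a non-null result gives the index, so for the array 10, 20, 30 a key of 28 would map to index 1 under the assumption above.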

Answer 3 (score: 0)

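Frankly, I find the logic of finding the first number greater than a given number much easier than the logic of finding a number less than or equal to it. The reason is the extra logic (the edge cases) needed to handle duplicates of the given number in the array.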

public int justGreater(int[] arr, int val, int s, int e){
    // Returns the index of first element greater than val. 
    // If no such value is present, returns the size of the array.
    if (s >= e){
        return arr[s] <= val ? s+1 : s;
    }
    int mid = (s + e) >> 1;
    // Use <= so that duplicates of val are skipped as well.
    if (arr[mid] <= val) return justGreater(arr, val, mid+1, e);
    return justGreater(arr, val, s, mid);
} 

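Then, to find the index of the closest value less than or equal to the search value in a sorted array, just subtract 1 from the return value: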

ans = justGreater(arr, val, 0, arr.length-1) - 1;
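In C++ the "first element greater than val" step is what the standard `std::upper_bound` algorithm computes, so the subtract-one idea can be sketched in a few lines. The wrapper name `index_of_closest_lesser` is an assumption made for illustration, and the vector is assumed to be sorted ascending.

```cpp
#include <algorithm>
#include <vector>

// Index of the last element <= search_val in an ascending-sorted vector, or -1.
int index_of_closest_lesser(const std::vector<int> &sorted, int search_val) {
    // upper_bound returns an iterator to the first element > search_val
    // (or end() if there is none), which matches what justGreater computes.
    auto first_greater = std::upper_bound(sorted.begin(), sorted.end(), search_val);
    return static_cast<int>(first_greater - sorted.begin()) - 1;
}
```

An empty vector needs no special case here: `upper_bound` returns `begin()`, and the result is -1.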

Answer 4 (score: 0)

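I wanted to offer a non-binary-search approach in C#. The function below finds the value closest to X that is not greater than X (it may equal X). It also does not require the list to be sorted. It runs in O(n) and can only finish sooner when the exact target number is found, in which case it terminates early and returns that integer.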

    public static int FindClosest(List<int> numbers, int target)
    {
        int current = 0;
        int difference = Int32.MaxValue;
        foreach(int integer in numbers)
        {
            if(integer == target)
            {
                return integer;
            }
            int diff = Math.Abs(target - integer);
            if(integer <= target && integer >= current && diff < difference)
            {
                current = integer;
                difference = diff;
            }
        }
        return current;
    }

I tested this with the following setup and it seems to work fine:

            List<int> values = new List<int>() {1,24,32,6,14,9,11,22 };
            int target = 21;
            int closest = FindClosest(values,target);
            Console.WriteLine("Closest: " + closest);
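Note that the C# function returns the value rather than an index, and because `current` starts at 0 it cannot distinguish "no element qualifies" from a result of 0, nor pick a negative element. A hedged C++ sketch of the same linear scan that makes the "nothing found" case explicit could look like this; `find_closest_not_above` is a hypothetical name, and `std::optional` requires C++17.

```cpp
#include <optional>
#include <vector>

// Hypothetical variant: largest value <= target in an unsorted list, or
// std::nullopt when every element is greater than target.
std::optional<int> find_closest_not_above(const std::vector<int> &numbers, int target) {
    std::optional<int> best;
    for (int value : numbers) {
        if (value > target) continue;        // above the target: not a candidate
        if (value == target) return value;   // exact hit: stop early
        if (!best || value > *best) best = value;
    }
    return best;
}
```

For a sorted input the binary-search answers remain preferable, since this scan is linear in the list length.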

Answer 5 (score: 0)

A commented version in TypeScript. Based on this answer, but modified to return the value that is less than or equal to the needle.

/**
 * Binary Search of a sorted array but returns the closest smaller value if the
 * needle is not in the array.
 *
 * Returns null if the needle is not in the array and no smaller value is in
 * the array.
 *
 * @param haystack the sorted array to search
 * @param needle the needle to search for in the haystack
 * @param compareFn classical comparison function, return -1 if a is less
 * than b, 0 if a is equal to b, and 1 if a is greater than b
 */
export function lessThanOrEqualBinarySearch<T>(
  haystack: T[],
  needle: T,
  compareFn: (a: T, b: T) => number
): T | null {
  let lo = 0;
  let hi = haystack.length - 1;
  let lowestFound: T | null = null;

  // iteratively search halves of the array but when we search the larger
  // half keep track of the largest value in the smaller half
  while (lo <= hi) {
    let mid = (hi + lo) >> 1;
    let cmp = compareFn(needle, haystack[mid]);

    // needle is smaller than middle
    // search in the bottom half
    if (cmp < 0) {
      hi = mid - 1;
      continue;
    }

    // needle is larger than middle
    // search in the top half
    else if (cmp > 0) {
      lo = mid + 1;
      lowestFound = haystack[mid];
    } else if (cmp === 0) {
      return haystack[mid];
    }
  }
  return lowestFound;
}

Answer 6 (score: -2)

A non-recursive way using a loop. I used it in JavaScript, so I'll only post it in JavaScript:

let left = 0
let right = array.length
let mid = 0

while (left < right) {
    mid = Math.floor((left + right) / 2)
    if (searchValue < array[mid]) {
        right = mid
    } else {
        left = mid + 1
    }
}

return left - 1

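Because the usual guidance tells us to look at the middle pointer, many people don't see that the actual answer is the final value of the left pointer.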
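Why the left pointer holds the answer becomes clearer once the loop invariant is written down. The following C++ sketch mirrors the JavaScript loop under the assumption of an ascending-sorted `std::vector<int>`; the function name `last_index_less_or_equal` is made up for illustration.

```cpp
#include <vector>

// Index of the last element <= search_value in an ascending-sorted vector, or -1.
int last_index_less_or_equal(const std::vector<int> &array, int search_value) {
    int left = 0;
    int right = static_cast<int>(array.size());
    // Invariant: every index <  left  holds a value <= search_value,
    //            every index >= right holds a value >  search_value.
    while (left < right) {
        int mid = left + (right - left) / 2;
        if (search_value < array[mid]) {
            right = mid;       // array[mid] is too big, so it joins the right part
        } else {
            left = mid + 1;    // array[mid] qualifies, so it joins the left part
        }
    }
    // Here left == right is the first index whose value is > search_value.
    return left - 1;
}
```

When the loop ends the two parts meet, so the element just before `left` is the last one that is less than or equal to the search value, and `left - 1` is -1 when no element qualifies.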