Get the index of the closest value using binary search

Time: 2014-05-15 14:58:35

Tags: python binary-search

I want to do a binary search in Python:

def binarySearch(data, val):

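where data is a sorted array and val is the value to search for. If the value is found, I want to return its index (such that data[index] == val). If the value is not found, I want to return the index of the item closest to that value.

This is what I have: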

def binarySearch(data, val):
    high = len(data)-1
    low = 0
    while True:
        index = (high + low) / 2
        if data[index] == val:
            return index
        if data[index] < val:
            low = index
        if data[index] > val:
            high = index

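5 answers:

Answer 0 (score: 9)

Here is code that returns the index if the value is found, and otherwise the index of the item closest to that value. Hope it helps.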

def binarySearch(data, val):
    lo, hi = 0, len(data) - 1
    best_ind = lo
    while lo <= hi:
        mid = lo + (hi - lo) // 2  # floor division keeps mid an int on Python 3
        if data[mid] < val:
            lo = mid + 1
        elif data[mid] > val:
            hi = mid - 1
        else:
            best_ind = mid
            break
        # check if data[mid] is closer to val than data[best_ind] 
        if abs(data[mid] - val) < abs(data[best_ind] - val):
            best_ind = mid
    return best_ind

def main():
    data = [1, 2, 3, 4, 5, 6, 7]
    val = 6.1
    ind = binarySearch(data, val)
    print('data[%d]=%d' % (ind, data[ind]))

if __name__ == '__main__':
    main()

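Answer 1 (score: 4)

Something like this should work. It returns an array containing two indices. If val is found, both values in the returned array are the same. Otherwise, it returns the indices of the two items closest to val.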

def binarySearch(data, val):
    highIndex = len(data)-1
    lowIndex = 0
    while highIndex > lowIndex:
        # floor division keeps index an int on Python 3
        index = (highIndex + lowIndex) // 2
        sub = data[index]
        if data[lowIndex] == val:
            return [lowIndex, lowIndex]
        elif sub == val:
            return [index, index]
        elif data[highIndex] == val:
            return [highIndex, highIndex]
        elif sub > val:
            if highIndex == index:
                return sorted([highIndex, lowIndex])
            highIndex = index
        else:
            if lowIndex == index:
                return sorted([highIndex, lowIndex])
            lowIndex = index
    return sorted([highIndex, lowIndex])
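
To get the single closest index the question asks for, the two candidate indices can be compared by their distance to val. The following is only a minimal sketch: pick_closest is a hypothetical helper name that does not appear in the answer, and sending ties to the first (lower) index is an assumption.

```python
def pick_closest(data, val, pair):
    """Return whichever index in pair points at the element nearest to val.

    pair is a two-element list of indices, such as the one returned by
    binarySearch above. pick_closest is a hypothetical helper name; on a
    tie the first (lower) index wins, which is an arbitrary choice.
    """
    first, second = pair
    if abs(data[second] - val) < abs(data[first] - val):
        return second
    return first


data = [1, 2, 3, 4, 5, 6, 7]
# 6.1 lies between data[5] == 6 and data[6] == 7
print(pick_closest(data, 6.1, [5, 6]))
```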

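Answer 2 (score: 0)

This is a sample implementation of binary search. I won't do all of your (home?)work for you; I'm sure you can figure out how to store and return the index of the closest value.

# BINARY SEARCH: O(log n), search space halved each step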
def biSearch(lst, find): # expects sorted lst 
    lowIndex = 0
    highIndex = len(lst) - 1
    midIndex = (lowIndex + highIndex)//2
    lastMid = None
    steps = 0
    while midIndex != lastMid:
        steps += 1
        if lst[midIndex] == find:
            return (midIndex, steps)
        if lst[midIndex] < find:
            lowIndex = midIndex + 1
        else:
            highIndex = midIndex - 1
        lastMid = midIndex    
        midIndex = (lowIndex + highIndex)//2
    return (-1, steps)
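
One possible way to finish that exercise is sketched below. It is not the answer's own code: it assumes a conventional `while low <= high` loop instead of the lastMid check, drops the steps counter, and uses a hypothetical function name closest_index. When the loop ends without a match, high and low have crossed and point at the two neighbours of the missing value, so only those two need to be compared.

```python
def closest_index(lst, find):
    """Return the index of the element of sorted lst nearest to find.

    Hypothetical helper; ties between two neighbours go to the lower index.
    """
    if not lst:
        raise ValueError("lst must not be empty")
    low, high = 0, len(lst) - 1
    while low <= high:
        mid = (low + high) // 2
        if lst[mid] == find:
            return mid
        if lst[mid] < find:
            low = mid + 1
        else:
            high = mid - 1
    # No exact match: now high == low - 1, with lst[high] < find < lst[low]
    # whenever both indices are inside the list.
    if high < 0:
        return 0                # find is smaller than every element
    if low >= len(lst):
        return len(lst) - 1     # find is larger than every element
    if find - lst[high] <= lst[low] - find:
        return high
    return low


print(closest_index([1, 3, 8], 5))
```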

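Answer 3 (score: 0)

I know this is an old question, but it ranks high in Google results and I had the same problem. NumPy has a built-in function for this that uses binary search and lets you pass in a reference array and an array to compare.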

numpy.searchsorted(a, v, side='left', sorter=None)

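a is the reference array (data in the original question) and v is the array to compare (val from the question). The function returns an integer array of the same size as v, where the nth entry is the index at which the nth element of v would need to be inserted into a to preserve the sort order of a. The side keyword determines whether an element of v is placed to the 'left' (before) or to the 'right' (after) of a matching value in a.

[Documentation link as of July 2017] https://docs.scipy.org/doc/numpy/reference/generated/numpy.searchsorted.html#numpy.searchsorted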
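
Note that an insertion index is not yet the index of the closest value; the neighbour on each side of the insertion point still has to be compared. The sketch below illustrates that extra step with the standard library's bisect.bisect_left, which for a single value on a sorted list gives the same insertion point as searchsorted with side='left'. The function name closest_index_bisect is hypothetical, and sending ties to the lower index is an assumption.

```python
from bisect import bisect_left


def closest_index_bisect(data, val):
    """Return the index of the element of sorted data nearest to val.

    Hypothetical helper built on an insertion point; ties go to the
    lower index.
    """
    if not data:
        raise ValueError("data must not be empty")
    pos = bisect_left(data, val)   # first index with data[pos] >= val
    if pos == 0:
        return 0                   # val is at or below the first element
    if pos == len(data):
        return len(data) - 1       # val is above the last element
    before, after = data[pos - 1], data[pos]
    # Compare the neighbours on either side of the insertion point.
    if val - before <= after - val:
        return pos - 1
    return pos


print(closest_index_bisect([1, 2, 3, 4, 5, 6, 7], 6.1))
```

With NumPy, the same comparison could be applied to each index returned by numpy.searchsorted.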

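Answer 4 (score: 0)

This is not an answer to this question, but I ended up here while trying to figure out how to get the two values surrounding a given target item in a sorted list.

In case anyone else is looking, this is what I came up with based on some of the other answers.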

import random


def get_nearest(items, target):
    print(f'looking for {target}')
    high_index = len(items) - 1
    low_index = 0

    if not items[low_index] <= target <= items[high_index]:
        raise ValueError(f'The target {target} is not in the range of'
                         f' provided items {items[low_index]}:{items[high_index]}')

    if target in items:
        return target, target

    while high_index > low_index:
        index = (high_index + low_index) // 2  # floor division, no float round trip
        sub = items[index]

        if sub > target:
            if high_index == index:
                return tuple(sorted([items[high_index], items[low_index]]))
            high_index = index
        else:
            if low_index == index:
                return tuple(sorted([items[high_index], items[low_index]]))
            low_index = index
    return tuple(sorted([items[high_index], items[low_index]]))


if __name__ == '__main__':
    my_randoms = sorted(random.sample(range(10000000), 100000))
    x = 340000
    print(get_nearest(my_randoms, x))

    x = 0
    my_randoms = [x] + my_randoms
    print(get_nearest(my_randoms, x))

    x = 10000000
    my_randoms.append(x)
    print(get_nearest(my_randoms, x))

    idx = random.randint(0, 100000)
    x = my_randoms[idx]
    print(get_nearest(my_randoms, x))
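
One thing to be aware of in the code above is that `target in items` is a linear scan of the list, so the function as a whole is not O(log n). A possible alternative, sketched here with the standard library's bisect.bisect_left, finds both the exact match and the surrounding values from a single insertion point. The name get_surrounding is hypothetical, and the behaviour (raise when out of range, return the target twice on an exact match) is assumed to mirror get_nearest.

```python
from bisect import bisect_left


def get_surrounding(items, target):
    """Return the two values of sorted items that surround target.

    Hypothetical variant of get_nearest: returns (target, target) on an
    exact match and raises ValueError when target is out of range.
    """
    if not items or not items[0] <= target <= items[-1]:
        raise ValueError(f'The target {target} is not in the range of'
                         f' the provided items')
    pos = bisect_left(items, target)   # first index with items[pos] >= target
    if items[pos] == target:
        return target, target
    # The range check guarantees pos >= 1 here.
    return items[pos - 1], items[pos]


print(get_surrounding([10, 20, 30], 25))
```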