Question

我有一个排序列表l（大约20,000个元素），并希望找到l中超过给定值t_min的第一个元素。目前，我的代码如下。

def find_index(l):   
    first=next((t for t in l if t>t_min), None) 
    if first==None:
        return None
    else:
        return l.index(first)

为了对代码进行基准测试，我使用cProfile运行测试循环，并通过比较控制循环的时间来删除随机生成列表所需的时间：

import numpy
import cProfile

def test_loop(n):
    for _ in range(n):
        test_l=sorted(numpy.random.random_sample(20000))
        find_index(test_l, 0.5)

def control_loop(n):
    for _ in range(n):
        test_l=sorted(numpy.random.random_sample(20000))

# cProfile.run('test_loop(1000)') takes 10.810 seconds
# cProfile.run('control_loop(1000)') takes 9.650 seconds

find_index的每个函数调用大约需要1.16 ms。有没有办法改进代码以提高代码效率，因为我们知道列表已经排序了？

Answer 1

标准库bisect模块对此非常有用，以及此用例的文档contain an example。

def find_gt(a, x):
    'Find leftmost value greater than x'
    i = bisect_right(a, x)
    if i != len(a):
        return a[i]
    raise ValueError

有效地查找排序列表中元素的索引

1 个答案: