Question

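This is what I am trying to do in Python:

I have an array (freq_arr). I want to find the indices of the first group of non-zero elements. I search for non-zero elements from the start, and when I find the first one (the value 5 in the example below), I record its index (4 in the example below). I then look at the next element and record its index (which will be 5). If I encounter a zero, I want to ignore it and keep searching for non-zero values. In this way I take the values 5, 6, 0, 8, 9, 0, 1 at indices 4, 5, 6, 7, 8, 9 and 10. After these values there are four zeros, so I stop searching. At most two consecutive zeros may appear in the output while the search continues. However, if I encounter 3 or more zeros in a row, I want to stop searching.

Input: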

freq_arr = np.array([0, 0, 0, 0, 5, 6, 0, 8, 9, 0, 1, 0, 0, 0, 0, 3, 6, 0])

Output:

out_arr_indices = [4, 5, 6, 7, 8, 9, 10]

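I know how to code this with a for loop, but I would like to avoid that because it is inefficient. Please let me know how to do this.

The array will be one-dimensional. Each element will be in the range 5000 to 20000.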

Answer 1

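If I understand your question, you want to walk through the list, skip runs of two or fewer consecutive zeros, and append the indices of the non-zero values to an output array. Perhaps something like the following: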

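freq_arr = [0, 0, 5, 6, 0, 8, 9, 0, 1, 0, 0, 0, 0, 3, 6, 0]
outputarr = []

count = 0       # current position in freq_arr
zerocount = 0   # length of the current run of zeros after a non-zero value

while count < len(freq_arr) and zerocount < 3:
    if freq_arr[count] == 0:
        # Leading zeros (before any non-zero value) must not end the search
        if outputarr:
            zerocount += 1
    else:
        zerocount = 0
        outputarr.append(count)
    count += 1

# outputarr holds only the indices of the non-zero values: [2, 3, 5, 6, 8]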

If you provide more details, we can help better.
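The expected output in the question also contains the indices of the tolerated zeros (6 and 9), not just those of the non-zero values. A minimal loop-based sketch that returns the whole contiguous range is given here; the function name `first_nonzero_group` and the `max_zeros` parameter are hypothetical and not part of the original answer.

```python
def first_nonzero_group(values, max_zeros=2):
    """Indices from the first non-zero value up to the last non-zero value
    seen before a run of more than `max_zeros` consecutive zeros."""
    start = None   # index of the first non-zero value
    last = None    # index of the most recent non-zero value
    zero_run = 0   # zeros seen since `last`
    for i, v in enumerate(values):
        if v != 0:
            if start is None:
                start = i
            last = i
            zero_run = 0
        elif start is not None:
            # Only zeros after the first non-zero value count toward the limit
            zero_run += 1
            if zero_run > max_zeros:
                break
    if start is None:
        return []  # no non-zero value at all
    return list(range(start, last + 1))


freq_arr = [0, 0, 0, 0, 5, 6, 0, 8, 9, 0, 1, 0, 0, 0, 0, 3, 6, 0]
print(first_nonzero_group(freq_arr))  # [4, 5, 6, 7, 8, 9, 10]
```

This is still a plain Python loop, so it does not address the efficiency concern raised in the question; it only pins down the intended result.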

Answer 2

Here's one approach with slicing and argmax (to detect the non-zeros and the zeros) -

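import numpy as np

def start_stop_indices(freq_arr, W=3):
    # True where the value is non-zero
    nnz_mask = freq_arr!=0
    # argmax on a boolean array gives the index of the first True
    start_idx = nnz_mask.argmax()
    m0 = nnz_mask[start_idx:]
    kernel = np.ones(W,dtype=int)
    # The convolution counts the non-zeros in each window of W elements;
    # its first 0 marks the first run of W consecutive zeros
    last_idx = np.convolve(m0, kernel).argmin() + start_idx - W
    return start_idx, last_idx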
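To see why the `- W` offset appears, it may help to look at the intermediate arrays for the question's input. The following is only an illustrative walk-through of the same steps; the names `window_counts` and `first_empty` are introduced here for the illustration.

```python
import numpy as np

freq_arr = np.array([0, 0, 0, 0, 5, 6, 0, 8, 9, 0, 1, 0, 0, 0, 0, 3, 6, 0])
W = 3

nnz_mask = freq_arr != 0
start_idx = nnz_mask.argmax()          # first non-zero value
m0 = nnz_mask[start_idx:].astype(int)  # 1 = non-zero, 0 = zero

# In "full" mode, entry k is the number of non-zero values in the window
# m0[k-W+1 : k+1] (clipped at the array ends).
window_counts = np.convolve(m0, np.ones(W, dtype=int))
first_empty = window_counts.argmin()   # first window containing only zeros

print(start_idx)      # 4
print(window_counts)  # [1 2 2 2 2 2 2 1 1 0 0 1 2 2 1 0]
print(first_empty)    # 9

# The empty window ends at position first_empty of m0, so the last non-zero
# value sits W positions earlier; adding start_idx maps back to freq_arr.
print(first_empty + start_idx - W)  # 10
```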

Sample run -

In [203]: freq_arr
Out[203]: array([0, 0, 0, 0, 5, 6, 0, 8, 9, 0, 1, 0, 0, 0, 0, 3, 6, 0])

In [204]: start_stop_indices(freq_arr, W=3)
Out[204]: (4, 10)

In [205]: start_stop_indices(freq_arr, W=2)
Out[205]: (4, 10)

In [206]: start_stop_indices(freq_arr, W=1)
Out[206]: (4, 5)
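The question asks for the list of indices rather than a (start, stop) pair. A minimal sketch of that last step, assuming the pair (4, 10) from the W=3 sample run above and using `np.arange`:

```python
import numpy as np

freq_arr = np.array([0, 0, 0, 0, 5, 6, 0, 8, 9, 0, 1, 0, 0, 0, 0, 3, 6, 0])

# Pair taken from the W=3 sample run above
start_idx, last_idx = 4, 10

# The stop value of arange is exclusive, hence the + 1
out_arr_indices = np.arange(start_idx, last_idx + 1)

print(out_arr_indices)            # [ 4  5  6  7  8  9 10]
print(freq_arr[out_arr_indices])  # [5 6 0 8 9 0 1]
```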

Here's another one for a fixed window search of length 3 that avoids convolution and makes more use of slicing -

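def start_stop_indices_v2(freq_arr):
    nnz_mask = freq_arr!=0
    start_idx = nnz_mask.argmax()
    m0 = nnz_mask[start_idx:]
    # OR of three shifted views: False only where three consecutive values are zero
    idx0 = (m0[:-2] | m0[1:-1] | m0[2:]).argmin()
    # The element just before that window is the last non-zero of the group
    last_idx = idx0 + start_idx - 1
    return start_idx, last_idx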
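Both functions above assume that the array contains at least one non-zero value and that a run of W zeros follows the first group. As a hedged sketch only, the slicing idea could be generalized to any window length with `numpy.lib.stride_tricks.sliding_window_view` (available in NumPy 1.20 and later), with those two cases handled explicitly. This swaps in a different technique from the answer's, and the name `start_stop_indices_windowed` is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def start_stop_indices_windowed(freq_arr, W=3):
    nnz_mask = np.asarray(freq_arr) != 0
    if not nnz_mask.any():
        return None  # assumption: signal "no non-zero value" with None
    start_idx = int(nnz_mask.argmax())
    m0 = nnz_mask[start_idx:]
    if m0.size >= W:
        # Row k is m0[k:k+W]; it is all zeros exactly when any() is False
        has_nonzero = sliding_window_view(m0, W).any(axis=1)
        if not has_nonzero.all():
            idx0 = int(has_nonzero.argmin())  # start of first all-zero window
            return start_idx, start_idx + idx0 - 1
    # No run of W zeros: the group ends at the last non-zero value
    return start_idx, start_idx + int(np.flatnonzero(m0)[-1])


freq_arr = np.array([0, 0, 0, 0, 5, 6, 0, 8, 9, 0, 1, 0, 0, 0, 0, 3, 6, 0])
print(start_stop_indices_windowed(freq_arr, W=3))  # (4, 10)
print(start_stop_indices_windowed(freq_arr, W=1))  # (4, 5)
```

For the question's input this gives the same pairs as the sample runs above; the extra branches only matter for inputs without a long enough run of zeros.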

First set of non-zero values (ignoring single occurrence of zero)
