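Find consecutive ones in a numpy array

Time: 2016-02-24 19:04:44

Tags: numpy

How can I find the number of consecutive 1s (or any other value) in each row of the following numpy array? I need a pure numpy solution.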

counts
Out[304]: 
array([[0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0],
       [0, 0, 1, 0, 0, 1, 2, 0, 0, 1, 1, 1],
       [0, 0, 0, 4, 1, 0, 0, 0, 0, 1, 1, 0]])

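Desired result for the first question (what is the maximum number of consecutive 1s in each row): amount: array([2, 3, 2])

Second question (the index at which a 1 first occurs 2 times in a row): index: array([3, 9, 9])

In this example I used 2 times in a row, but it is important that this can be changed to, say, 5 times in a row.

The second part of the question: once I have found the runs of consecutive 1s (or any other value), I also need their starting indices. This should be done row by row.

A similar question was answered with np.unique, but that only works for a single row and not for an array with multiple rows, because the results would have different lengths: Get a list of all indices of repeated elements in a numpy array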

2 answers:

Answer 0: (score: 3)

Here is a vectorized approach based on differentiation -

import numpy as np
import pandas as pd

# Append zeros columns at either sides of counts
append1 = np.zeros((counts.shape[0],1),dtype=int)
counts_ext = np.column_stack((append1,counts,append1))

# Get start and stop indices with 1s as triggers
diffs = np.diff((counts_ext==1).astype(int),axis=1)
starts = np.argwhere(diffs == 1)
stops = np.argwhere(diffs == -1)

# Get intervals using differences between start and stop indices
start_stop = np.column_stack((starts[:,0], stops[:,1] - starts[:,1]))

# Get indices corresponding to max. interval lens and thus lens themselves
SS_df = pd.DataFrame(start_stop)
out = start_stop[SS_df.groupby([0],sort=False)[1].idxmax(),1]

Sample input, output -

Original sample case:

In [574]: counts
Out[574]: 
array([[0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0],
       [0, 0, 1, 0, 0, 1, 2, 0, 0, 1, 1, 1],
       [0, 0, 0, 4, 1, 0, 0, 0, 0, 1, 1, 0]])

In [575]: out
Out[575]: array([2, 3, 2], dtype=int64)

Modified case:

In [577]: counts
Out[577]: 
array([[0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0],
       [0, 0, 1, 0, 0, 1, 2, 0, 1, 1, 1, 1],
       [0, 0, 0, 4, 1, 1, 1, 1, 1, 0, 1, 0]])

In [578]: out
Out[578]: array([2, 4, 5], dtype=int64)
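
To see why the differencing step finds the runs, here is a minimal single-row walk-through. It is only an illustration of the idea, not part of the answer's code; the variable names `row`, `padded` and `lengths` are chosen for this sketch.

```python
import numpy as np

row = np.array([0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0])

# Pad with a zero on each side so runs touching either edge are detected too.
padded = np.concatenate(([0], (row == 1).astype(int), [0]))

# In the differences, +1 marks the first element of a run and
# -1 marks the position one past its last element.
d = np.diff(padded)
starts = np.flatnonzero(d == 1)
stops = np.flatnonzero(d == -1)
lengths = stops - starts

print(starts)         # [1 3 7]
print(lengths)        # [1 2 2]
print(lengths.max())  # 2
```

Because of the leading pad, the positions in `starts` are already indices into the original `row`.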

Here is a pure NumPy version, identical to the previous one up to the point where we get starts and stops. Here is the full implementation -

# Append zeros columns at either sides of counts
append1 = np.zeros((counts.shape[0],1),dtype=int)
counts_ext = np.column_stack((append1,counts,append1))

# Get start and stop indices with 1s as triggers
diffs = np.diff((counts_ext==1).astype(int),axis=1)
starts = np.argwhere(diffs == 1)
stops = np.argwhere(diffs == -1)

# Get intervals using differences between start and stop indices
intvs = stops[:,1] - starts[:,1]

# Store intervals as a 2D array for further vectorized ops to make.
c = np.bincount(starts[:,0], minlength=counts.shape[0])  # minlength keeps trailing rows that have no 1s
mask = np.arange(c.max()) < c[:,None]
intvs2D = mask.astype(float)
intvs2D[mask] = intvs

# Get max along each row as final output
out = intvs2D.max(1)
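
The second part of the question (the start index of the first run of at least n consecutive values) can be read off the same starts and lengths. Below is a rough sketch under stated assumptions: `run_stats` is a hypothetical helper name, not something from the answer above, and returning -1 for rows without a qualifying run is a convention picked for this example.

```python
import numpy as np

def run_stats(arr, value=1, min_len=2):
    """Hypothetical helper (not from the answer above).

    Returns, per row, the length of the longest run of `value` and the start
    index of the first run that is at least `min_len` long (-1 if none).
    """
    arr = np.asarray(arr)
    n_rows = arr.shape[0]

    # Same padding and differencing as in the answer.
    pad = np.zeros((n_rows, 1), dtype=int)
    ext = np.column_stack((pad, (arr == value).astype(int), pad))
    diffs = np.diff(ext, axis=1)
    starts = np.argwhere(diffs == 1)    # columns: row, start index
    stops = np.argwhere(diffs == -1)    # columns: row, one past the end
    lengths = stops[:, 1] - starts[:, 1]

    # Longest run per row; rows without any run stay at 0.
    max_len = np.zeros(n_rows, dtype=np.intp)
    np.maximum.at(max_len, starts[:, 0], lengths)

    # First sufficiently long run per row. argwhere lists runs in row-major
    # order, so the first occurrence of each row number is its earliest run.
    keep = lengths >= min_len
    rows, cols = starts[keep, 0], starts[keep, 1]
    first_start = np.full(n_rows, -1, dtype=np.intp)
    uniq_rows, first_pos = np.unique(rows, return_index=True)
    first_start[uniq_rows] = cols[first_pos]
    return max_len, first_start

counts = np.array([[0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0],
                   [0, 0, 1, 0, 0, 1, 2, 0, 0, 1, 1, 1],
                   [0, 0, 0, 4, 1, 0, 0, 0, 0, 1, 1, 0]])

max_len, first_start = run_stats(counts, value=1, min_len=2)
print(max_len)      # [2 3 2]
print(first_start)  # [3 9 9]
```

Changing `min_len` to 5 gives the "5 times in a row" variant asked about in the question, and the integer output avoids the float result of the masked-array step.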

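Answer 1: (score: 0)

I think a very similar problem is checking whether the element-wise difference between entries of sorted rows equals a certain amount. Here it checks for five consecutive values that each differ by 1, as shown below. For a pair of cards the same idea also works with a difference of 0: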

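# Assumes each row of cards holds distinct values sorted in descending order
cardAmount = cards.shape[1]
# Difference between each card and the one four positions after it
has4 = cards[:, :cardAmount-4] - cards[:, 4:]
isStraight = np.any(has4 == 4, axis=1)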
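
As a small worked example of this idea, the sketch below runs the check on two made-up hands. The `cards` array is hypothetical sample data, and it assumes 7 distinct ranks per row sorted in descending order; with duplicate ranks a gap of 4 would no longer guarantee five consecutive values.

```python
import numpy as np

# Hypothetical hands: 7 distinct card ranks per row, sorted in descending order.
cards = np.array([
    [13, 9, 8, 7, 6, 5, 2],     # contains 9-8-7-6-5
    [14, 12, 10, 8, 6, 4, 2],   # no five consecutive ranks
])

n = cards.shape[1]

# Compare each card with the one four positions later. For distinct, sorted
# ranks, a gap of exactly 4 means those five cards are consecutive.
gap = cards[:, :n - 4] - cards[:, 4:]
is_straight = np.any(gap == 4, axis=1)

print(gap)
print(is_straight)
```

Here the first row yields gaps of 7, 4 and 6, so it is flagged as a straight, while every gap in the second row is 8.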