Question

Given two arrays, one representing a stream of data and the other representing group counts, such as:

import numpy as np
group_counts = np.array([3,4,3,2])
data = np.arange(group_counts.sum())

I would like to generate a matrix from the stream data based on the group counts, such as:

target_count = 3 # I want to make a matrix of all data items whose group count equals target_count
# Expected result
# [[ 0  1  2]
#  [ 7  8  9]]

To do this, I wrote the following:

# Find all matches
match = np.where(group_counts == target_count)[0]
i1 = np.cumsum(group_counts)[match] # end index for slicing
i0 = i1 - group_counts[match] # start index for slicing

# Prep the blank matrix and fill with results
matched_matrix = np.empty((match.size, target_count), dtype=data.dtype)

# Is it possible to get rid of this loop?
for i in range(match.size):
    matched_matrix[i] = data[i0[i]:i1[i]]

matched_matrix
# Result: array([[0, 1, 2],
#                [7, 8, 9]])

This works, but I would like to get rid of the loop and I can't figure out how.

Doing some research, I did find numpy.split and numpy.array_split:

match = np.where(group_counts == target_count)[0]
match = np.array(np.split(data, np.cumsum(group_counts)), dtype=object)[match]
# Result: array([array([0, 1, 2]), array([7, 8, 9])], dtype=object)

But numpy.split produces a list of dtype=object that I have to convert.

Is there an elegant way to produce the desired result without a loop?
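
For comparison, the conversion step for the numpy.split route might look like the sketch below. It assumes the example arrays from the answer (`group_counts = [3, 4, 3, 2]`, `data = np.arange(12)`) and that the counts sum to the length of `data`. The helper name `rows_via_split` is made up for illustration. It still iterates over the chunks in Python, so it does not remove the loop.

```python
import numpy as np

def rows_via_split(data, group_counts, target_count):
    """Stack every group of length target_count into a 2-D array (hypothetical helper)."""
    # Split points are the cumulative counts; drop the last one so that
    # np.split does not return a trailing empty chunk.
    chunks = np.split(data, np.cumsum(group_counts)[:-1])
    # Keep only chunks of the wanted length (still a Python-level loop).
    matched = [c for c in chunks if c.size == target_count]
    if not matched:
        # No group has the wanted length: return an empty matrix.
        return np.empty((0, target_count), dtype=data.dtype)
    return np.vstack(matched)

# Assumed example input, matching the arrays used in the answer.
group_counts = np.array([3, 4, 3, 2])
data = np.arange(group_counts.sum())
result = rows_via_split(data, group_counts, 3)
print(result)
```

Stacking the filtered chunks with `np.vstack` yields a regular integer matrix instead of an object array.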

Answer 1

You can repeat group_counts so that it has the same size as data, then filter and reshape based on the target:

import numpy as np

group_counts = np.array([3,4,3,2])
data = np.arange(group_counts.sum())

target = 3
data[np.repeat(group_counts, group_counts) == target].reshape(-1, target)

#array([[0, 1, 2],
#       [7, 8, 9]])
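
To see why the one-liner works, it may help to split it into its intermediate steps. The sketch below uses the same example arrays; the names `sizes`, `mask` and `result` are introduced here only for illustration. It assumes the counts sum to the length of `data`.

```python
import numpy as np

group_counts = np.array([3, 4, 3, 2])
data = np.arange(group_counts.sum())  # the values 0..11
target = 3

# Repeat each count once per element of its group, so every data item
# is labelled with the size of the group it belongs to.
sizes = np.repeat(group_counts, group_counts)

# Boolean mask selecting the items whose group has the target size.
mask = sizes == target

# Selected items keep their stream order, so each consecutive run of
# `target` items is one group; reshape turns those runs into rows.
result = data[mask].reshape(-1, target)
print(result)
```

The reshape cannot fail on a length mismatch, because every matching group contributes exactly `target` items to the selection.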

Group elements of a numpy array using a group counts array
