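I am trying to solve the following problem. I have a numpy array that is labelled into regions numbered from 1 to n. Let's assume this is the array: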
x = np.array([[1, 1, 1, 4], [1, 1, 2, 4], [1, 2, 2, 4], [5, 5, 3, 4]], np.int32)
array([[1, 1, 1, 4],
       [1, 1, 2, 4],
       [1, 2, 2, 4],
       [5, 5, 3, 4]])
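A region is the group of cells in the numpy array that share one value. So in this example x has 5 regions; region 1 consists of 6 cells, region 2 consists of 3 cells, and so on. At the moment I determine the neighbouring regions of each region with the following lines of code: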
n = x.max()
tmp = np.zeros((n+1, n+1), bool)
# check the vertical adjacency
a, b = x[:-1, :], x[1:, :]
tmp[a[a!=b], b[a!=b]] = True
# check the horizontal adjacency
a, b = x[:, :-1], x[:, 1:]
tmp[a[a!=b], b[a!=b]] = True
# register adjacency in both directions (up, down) and (left,right)
result = (tmp | tmp.T)
result = result.astype(int)
np.column_stack(np.nonzero(result))
resultlist = [np.flatnonzero(row) for row in result[1:]]
This gives, for each region, a list of its neighbouring regions:
[array([2, 4, 5], dtype=int64),
 array([1, 3, 4, 5], dtype=int64),
 array([2, 4, 5], dtype=int64),
 array([1, 2, 3], dtype=int64),
 array([1, 2, 3], dtype=int64)]
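This works really well. However, I would also like to count the number of cells of each neighbouring region and return that in the output. For region 2 in this example, that would mean a total of 7 neighbouring cells (three 1s, two 4s, one 3 and one 5). Therefore:
How can I best adapt the code above to include the number of cells for each neighbouring region? Many thanks!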
Answer 0 (score: 3)
Here is a vectorized solution using the numpy_indexed package (note: it is vectorized over pixels rather than over regions, which is useful assuming n_pixels >> n_regions):
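import numpy as np
import numpy_indexed as npi

# pair every cell with its left, right, lower and upper neighbour (each adjacency appears in both directions)
neighbors = np.concatenate([x[:, :-1].flatten(), x[:, +1:].flatten(), x[+1:, :].flatten(), x[:-1, :].flatten()])
centers = np.concatenate([x[:, +1:].flatten(), x[:, :-1].flatten(), x[:-1, :].flatten(), x[+1:, :].flatten()])
# keep only the pairs that cross a region boundary
valid = neighbors != centers

# collect the neighbouring labels of each region
regions, neighbors_per_regions = npi.group_by(centers[valid], neighbors[valid])
for region, neighbors_per_region in zip(regions, neighbors_per_regions):
    print(region)
    # neighbouring regions and their share of this region's boundary pairs, in percent
    unique_neighbors, neighbor_counts = npi.count(neighbors_per_region)
    print(unique_neighbors, neighbor_counts / neighbor_counts.sum() * 100)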
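If numpy_indexed is not available, the same pair counting can probably be reproduced with plain NumPy. The following is only a sketch under that assumption: `shared_edge_counts` is a hypothetical helper name, not part of any library, and it relies on `np.unique` with `axis=0` and `return_counts=True` to count identical (region, neighbour) rows.

```python
import numpy as np

x = np.array([[1, 1, 1, 4],
              [1, 1, 2, 4],
              [1, 2, 2, 4],
              [5, 5, 3, 4]], np.int32)


def shared_edge_counts(labels):
    """Hypothetical helper: count shared cell edges per (region, neighbour) pair."""
    # Every horizontally and vertically adjacent pair, taken in both directions.
    centers = np.concatenate([labels[:, 1:].ravel(), labels[:, :-1].ravel(),
                              labels[:-1, :].ravel(), labels[1:, :].ravel()])
    neighbors = np.concatenate([labels[:, :-1].ravel(), labels[:, 1:].ravel(),
                                labels[1:, :].ravel(), labels[:-1, :].ravel()])
    # Keep only the pairs that cross a region boundary.
    valid = centers != neighbors
    pairs = np.stack([centers[valid], neighbors[valid]], axis=1)
    # Identical rows are collapsed and counted; rows come back sorted.
    return np.unique(pairs, axis=0, return_counts=True)


pairs, counts = shared_edge_counts(x)
for (region, neighbour), count in zip(pairs, counts):
    print(region, neighbour, count)
```

Each row printed is a region, one of its neighbours, and the number of cell edges the two share.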
Alternatively, for a solution that is fully vectorized over both pixels and regions:
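# count how often each (neighbor, center) pair occurs; note that this rebinds neighbors and centers
(neighbors, centers), counts = npi.count((neighbors[valid], centers[valid]))
# total number of boundary pairs per region
region_group = npi.group_by(centers)
regions, neighbors_per_region = region_group.sum(counts)
# fraction of each region's boundary pairs that belongs to each neighbour
fractions = counts / neighbors_per_region[region_group.inverse]
for q in zip(centers, neighbors, fractions):
    print(q)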
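One thing worth checking against the example in the question: the counts above are counts of adjacent cell pairs, so a neighbouring cell that touches the same region on two sides is counted twice. For region 2 that gives four pairs with region 1, whereas the question lists three cells of region 1. If distinct neighbouring cells are what is wanted, a plain NumPy sketch could look like the following. `neighbour_cell_counts` is a hypothetical helper name, and the approach (tagging each neighbour with its flat cell index and dropping duplicates) is my assumption about the intended result.

```python
import numpy as np

x = np.array([[1, 1, 1, 4],
              [1, 1, 2, 4],
              [1, 2, 2, 4],
              [5, 5, 3, 4]], np.int32)


def neighbour_cell_counts(labels):
    """Hypothetical helper: count distinct adjacent cells per (region, neighbour) pair."""
    idx = np.arange(labels.size).reshape(labels.shape)  # flat index of every cell
    flat = labels.ravel()
    # (region side, neighbour side) slices for the four directions.
    slices = [
        (np.s_[:, 1:], np.s_[:, :-1]),   # neighbour to the left
        (np.s_[:, :-1], np.s_[:, 1:]),   # neighbour to the right
        (np.s_[1:, :], np.s_[:-1, :]),   # neighbour above
        (np.s_[:-1, :], np.s_[1:, :]),   # neighbour below
    ]
    centers = np.concatenate([labels[c].ravel() for c, _ in slices])
    cells = np.concatenate([idx[n].ravel() for _, n in slices])
    # Keep only the pairs that cross a region boundary.
    valid = centers != flat[cells]
    # Drop duplicates so a cell touching the same region on two sides counts once.
    region_cell = np.unique(np.stack([centers[valid], cells[valid]], axis=1), axis=0)
    # Replace each neighbouring cell index by its region label and count per pair.
    region_neighbour = np.stack([region_cell[:, 0], flat[region_cell[:, 1]]], axis=1)
    return np.unique(region_neighbour, axis=0, return_counts=True)


pairs, counts = neighbour_cell_counts(x)
for (region, neighbour), count in zip(pairs, counts):
    print(region, neighbour, count)
```

For region 2 this yields three cells of region 1, two of region 4, and one each of regions 3 and 5, which adds up to the 7 neighbouring cells described in the question.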