I have a very large numpy array...
power = ...
print(power.shape)
>>> (3, 10, 10, 19, 75, 10, 10)
which is symmetric w.r.t. the 10x10 parts, i.e. the following 2-d matrices are symmetric
power[i, :, :, j, k, l, m]
power[i, j, k, l, m, :, :]
for all values of i, j, k, l, m.
Can I take advantage of this for a factor-4 gain? E.g. when saving the matrix to a file (50 MB with savez_compressed).
My attempt:
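import numpy as np
size = 10
# (row, col) pairs of the lower triangle of one 10x10 block, diagonal included
row_idx, col_idx = np.tril_indices(size)
zip_idx = list(zip(row_idx.tolist(), col_idx.tolist()))
print(len(zip_idx), zip_idx[:5])
>>> 55 [(0, 0), (1, 0), (1, 1), (2, 0), (2, 1)]
# every combination of a lower-triangle entry in the first block with one in the second
all_idx = [(r0, c0, r1, c1) for (r0, c0) in zip_idx for (r1, c1) in zip_idx]
print(len(all_idx), all_idx[:5])
>>> 3025 [(0, 0, 0, 0), (0, 0, 1, 0), (0, 0, 1, 1), (0, 0, 2, 0), (0, 0, 2, 1)]
a, b, c, d = zip(*all_idx)
# move the four 10-sized axes to the end, then pick the 3025 combinations
tril_part = np.transpose(power, (0, 3, 4, 1, 2, 5, 6))[:, :, :, a, b, c, d]
print(tril_part.shape)
>>> (3, 19, 75, 3025)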
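This looks ugly, but it "works"... once I am also able to restore power from tril_part...
I guess this raises two questions:
Edit: the "size" comment is obviously valid, but please ignore it :-) IMHO the indexing part of the question stands on its own. I have found myself wanting to do similar indexing for smaller matrices.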
Answer 0: (score: 1)
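You are on the right track. With np.tril_indices you can indeed index these lower triangles neatly. What remains to be improved is the actual indexing/slicing of the data.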
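As a small illustration of the two building blocks, here is a minimal sketch on a 3x3 block (the size 3 and the names `n` and `pairs` are chosen here only for readability; the question uses 10). It shows which (row, col) pairs np.tril_indices selects and that np.ix_ simply adds the broadcasting axes one could also add by hand.

```python
import numpy as np

n = 3  # illustrative size only; the arrays in the question use 10

# Row and column indices of the lower triangle, diagonal included.
ix, iy = np.tril_indices(n)
pairs = list(zip(ix.tolist(), iy.tolist()))
print(pairs)  # every (row, col) pair with row >= col

# np.ix_ reshapes two 1-d index arrays into a column and a row so that
# they broadcast against each other when used together as indices.
ix1, ix2 = np.ix_(ix, ix)
print(ix1.shape, ix2.shape)

# Adding the axes by hand gives the same index arrays.
print(np.array_equal(ix1, ix[:, None]), np.array_equal(ix2, ix[None, :]))
```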
Please try this (copy and paste):
import numpy as np
shape = (3, 10, 10, 19, 75, 10, 10)
p = np.arange(np.prod(shape)).reshape(shape) # this is not symmetric, but not important
ix, iy = np.tril_indices(10)
# In order to index properly, we need to add axes. This can be done by hand or with np.ix_
ix1, ix2 = np.ix_(ix, ix)
iy1, iy2 = np.ix_(iy, iy)
p_ltriag = p[:, ix1, iy1, :, :, ix2, iy2]
print(p_ltriag.shape) # yields (55, 55, 3, 19, 75), axis order can be changed if needed
q = np.zeros_like(p)
q[:, ix1, iy1, :, :, ix2, iy2] = p_ltriag # fills the lower triangles on both sides
q[:, ix1, iy1, :, :, iy2, ix2] = p_ltriag # fills the lower on left, upper on right
q[:, iy1, ix1, :, :, ix2, iy2] = p_ltriag # fills the upper on left, lower on right
q[:, iy1, ix1, :, :, iy2, ix2] = p_ltriag # fills the upper triangles on both sides
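The array q now contains a symmetric version of p (where the upper triangles are replaced by the content of the lower triangles). Note that the later assignments use the iy and ix indices in swapped order (the last line swaps them on both sides), which essentially writes the transpose of the lower triangle into the upper one.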
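To connect this back to saving and restoring, the indexing can be wrapped in two helpers. This is only a sketch under assumptions: the names `pack_lower` and `unpack_lower` are made up here, the array is a small random one that is symmetrized by hand, and an in-memory buffer stands in for a real file.

```python
import io
import numpy as np

def _tril_grids(n):
    # Broadcastable lower-triangle indices for the two n x n blocks.
    ix, iy = np.tril_indices(n)
    ix1, ix2 = np.ix_(ix, ix)
    iy1, iy2 = np.ix_(iy, iy)
    return ix1, iy1, ix2, iy2

def pack_lower(p, n):
    """Keep only the lower triangles of the blocks on axes (1, 2) and (5, 6)."""
    ix1, iy1, ix2, iy2 = _tril_grids(n)
    return p[:, ix1, iy1, :, :, ix2, iy2]

def unpack_lower(packed, n):
    """Rebuild the full array by mirroring each lower triangle."""
    ix1, iy1, ix2, iy2 = _tril_grids(n)
    _, _, a, b, c = packed.shape
    q = np.zeros((a, n, n, b, c, n, n), dtype=packed.dtype)
    q[:, ix1, iy1, :, :, ix2, iy2] = packed  # lower / lower
    q[:, ix1, iy1, :, :, iy2, ix2] = packed  # lower / upper
    q[:, iy1, ix1, :, :, ix2, iy2] = packed  # upper / lower
    q[:, iy1, ix1, :, :, iy2, ix2] = packed  # upper / upper
    return q

# Small symmetric stand-in for the array in the question.
n = 4
rng = np.random.default_rng(0)
power = rng.random((2, n, n, 3, 4, n, n))
power = power + power.transpose(0, 2, 1, 3, 4, 5, 6)  # symmetric in axes (1, 2)
power = power + power.transpose(0, 1, 2, 3, 4, 6, 5)  # symmetric in axes (5, 6)

# Round trip through savez_compressed, using a buffer instead of a file.
buf = io.BytesIO()
np.savez_compressed(buf, tril_part=pack_lower(power, n))
buf.seek(0)
restored = unpack_lower(np.load(buf)["tril_part"], n)
print(np.array_equal(power, restored))
```

The packed array keeps the broadcast axes first, so its shape here is (10, 10, 2, 3, 4); with the shapes from the question it would be (55, 55, 3, 19, 75).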
Comparison of the lower triangles
For the comparison, we set all upper triangles to 0
ux, uy = np.triu_indices(10, k=1) # strictly upper triangle, so the diagonals are compared too
p[:, ux, uy] = 0
q[:, ux, uy] = 0
p[:, :, :, :, :, ux, uy] = 0
q[:, :, :, :, :, ux, uy] = 0
print(((p - q) ** 2).sum()) # squared euclidean distance is 0, so p and q are equal
print((p ** 2).sum(dtype=float), (q ** 2).sum(dtype=float)) # prove that not all entries are 0 ;) - accumulating in float avoids an integer overflow in the sum
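On the size of the gain mentioned in the question, a quick count may help. With the diagonal included, each 10x10 block keeps 55 of its 100 entries, so the two blocks together keep 55 * 55 = 3025 of 10000 combinations. That is a reduction of roughly 3.3 in the number of stored values, somewhat less than 4. The variable names below are only illustrative, and the effect on the compressed file size is not measured here.

```python
import numpy as np

n = 10
full = (n * n) ** 2                      # entries in a pair of full 10x10 blocks
tril = len(np.tril_indices(n)[0]) ** 2   # lower triangles incl. diagonal: 55 * 55
print(full, tril, round(full / tril, 2))
```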