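I have some 2D numpy data that I want to convert into a higher-dimensional array by taking the mean (or some other statistic) of the values that map to each cell.
The source data is 2D with shape MxN, and I want to map it onto a 4D array of shape AxBxCxD. The index mapping from the source data to each of the four dimensions is built from either a 2D (MxN) variable or a 1D (Mx1) variable that is tiled.
Below is a working example of what I am trying to do. It seems to work, but I would like to know whether there is a function that would let me:
1) avoid the for loops, and
2) allow the destination array to have a variable number of dimensions (3D, 4D, 5D, etc.).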
import numpy as np
# create data I want to conditionally average (MxN array)
zz = np.random.rand(100,10)
# create variables used to define binning for conditional averaging
# each variable defines one dimension of the final 4 dimensional array
aa = np.random.rand(100,10)*10
bb = np.random.rand(100,1) + 5
cc = np.random.rand(100,1) * 25
dd = np.random.rand(100,1)* 50 + 100
# define binning boundaries
binsaa = np.array([2, 4, 6, 8])
binsbb = np.array([5.1, 5.5, 5.7])
binscc = np.array([12])
binsdd = np.array([110, 133])
# create bin indices
idaa = np.digitize(aa,binsaa,right=True)
idbb = np.digitize(bb,binsbb,right=True)
idcc = np.digitize(cc,binscc,right=True)
iddd = np.digitize(dd,binsdd,right=True)
# tile some of the indices so they match the shape of the data to be averaged
idbbt = np.tile(idbb,[1,10])
idcct = np.tile(idcc,[1,10])
idddt = np.tile(iddd,[1,10])
# make empty destination 4 dimensional arrays
avgxx = np.zeros([5,4,2,3])
cntxx = np.zeros([5,4,2,3])
# use for loops to average original data and place in 4-dim array
for ixa in range(5):
    for ixb in range(4):
        for ixc in range(2):
            for ixd in range(3):
                # boolean mask of the samples that fall into this 4D bin
                idz = (idaa == ixa) & (idbbt == ixb) & (idcct == ixc) & (idddt == ixd)
                avgxx[ixa,ixb,ixc,ixd] = np.average(zz[idz])
                cntxx[ixa,ixb,ixc,ixd] = np.sum(idz)
print(avgxx[:,:,:,:])
print(cntxx[:,:,:,:])
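For reference, here is a minimal sketch of what a loop-free version might look like, to make the two requirements above concrete. It is only one possible approach, not a known built-in: the helper name `binned_mean` is hypothetical, and the sketch assumes the per-axis indices come from `np.digitize` as in the example, so each axis has `len(bins) + 1` cells. It flattens the per-axis indices into a single bin number with `np.ravel_multi_index` and accumulates sums and counts with `np.bincount`, so the number of dimensions is simply the number of index arrays passed in.

```python
import numpy as np

def binned_mean(values, ids, shape):
    """Hypothetical helper: average `values` into an array of `shape`.

    `ids` holds one integer index array per destination axis; each must
    broadcast to values.shape, so Mx1 arrays need no explicit np.tile.
    """
    # Broadcast every index array to the data shape and flatten it.
    flat_ids = tuple(np.broadcast_to(i, values.shape).ravel() for i in ids)
    # Collapse the per-axis indices into one flat bin number per sample.
    flat = np.ravel_multi_index(flat_ids, shape)
    nbins = int(np.prod(shape))
    counts = np.bincount(flat, minlength=nbins)
    sums = np.bincount(flat, weights=values.ravel(), minlength=nbins)
    # Empty bins stay NaN instead of triggering a division by zero.
    means = np.full(nbins, np.nan)
    np.divide(sums, counts, out=means, where=counts > 0)
    return means.reshape(shape), counts.reshape(shape)

# Same setup as the example above, with a seeded generator for repeatability.
rng = np.random.default_rng(0)
zz = rng.random((100, 10))
idaa = np.digitize(rng.random((100, 10)) * 10, [2, 4, 6, 8], right=True)
idbb = np.digitize(rng.random((100, 1)) + 5, [5.1, 5.5, 5.7], right=True)
idcc = np.digitize(rng.random((100, 1)) * 25, [12], right=True)
iddd = np.digitize(rng.random((100, 1)) * 50 + 100, [110, 133], right=True)

avg, cnt = binned_mean(zz, (idaa, idbb, idcc, iddd), (5, 4, 2, 3))
print(avg.shape, cnt.sum())
```

This only covers the mean and the count; other statistics (median, percentiles) would need a different accumulation step, which is part of why I am asking whether a general-purpose function exists.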