Python (NumPy) 2D and 3D array averaging over sub-grids

Time: 2019-05-11 07:27:00

Tags: python numpy indexing slice

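Use Python to extract gridded atmospheric data at different levels and convert it to netCDF <- COMPLETED

Use Python to find the gridded data for a region, then average that data over (2x2) sub-grids <- INCORRECT

I can do this in Octave / Matlab, but I would like to keep everything in Python. I think the problem is the indexing syntax and my inability to beat Python into submission when it comes to indexing.

Data: 1D arrays of longitude, latitude, and pressure level. Longitude has 49 elements, latitude has 13 elements, and level has 12 elements. The data I am trying to average is a 2D matrix (13x49) in the first instance and a 3D matrix (shape = 12x13x49) in the second.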
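
The layout described above can be reproduced with placeholder data to check shapes and index selection. This is only a sketch, not the asker's code: the random `trop` and `om` arrays stand in for the real fields, and the names `lat_idx`, `lon_idx`, `trop_sub`, and `om_sub` are assumptions. It builds the masks on the 1D coordinate axes, so `np.ix_` receives one index per selected row and one per selected column, rather than one (row, column) pair per grid point.

```python
import numpy as np

# Coordinate axes as described: 13 latitudes, 49 longitudes, 12 levels.
lats = np.arange(-60.0, -30.0 + 2.5, 2.5)   # 13 values
lons = np.arange(60.0, 180.0 + 2.5, 2.5)    # 49 values
nlev = 12

# Placeholder fields (hypothetical stand-ins for the real data).
rng = np.random.default_rng(0)
trop = rng.random((lats.size, lons.size))        # shape (13, 49)
om = rng.random((nlev, lats.size, lons.size))    # shape (12, 13, 49)

# 1D index arrays for the smaller region (5 latitudes, 5 longitudes).
lat_idx = np.flatnonzero((lats >= -50) & (lats <= -40))
lon_idx = np.flatnonzero((lons >= 120) & (lons <= 130))

# np.ix_ forms the outer product of the two index arrays.
trop_sub = trop[np.ix_(lat_idx, lon_idx)]        # shape (5, 5)

# For 3D data, keep all levels and broadcast the row/column indices.
om_sub = om[:, lat_idx[:, None], lon_idx]        # shape (12, 5, 5)

print(trop_sub.shape, om_sub.shape)
```

Because the selected indices are contiguous in this example, plain slices would return the same sub-arrays.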

from numpy import arange, argwhere, average, ix_, meshgrid, split, zeros

#DEFINE LARGE AREA OF GLOBE
londim_g  = 49
latdim_g  = 13
lonmin_g  = 60
lonmax_g  = 180
latmin_g  = -60
latmax_g  = -30
dlat = dlon = 2.5
lats_g      = arange(latmin_g,latmax_g+dlat,dlat)
lons_g      = arange(lonmin_g,lonmax_g+dlon,dlon)
LON_G,LAT_G = meshgrid(lons_g,lats_g) #THE SHAPE OF THIS IS A PROBLEM!!
# DEFINE SMALLER REGION
lonmin  = 120
lonmax  = 130
latmax  = -40
latmin  = -50
N       = 2 #THIS IS NxN SUB-GRID AVERAGE OF SMALLER REGION
ind  = argwhere( (LON_G>=lonmin) & (LON_G<=lonmax) & (LAT_G<=latmax) & (LAT_G>=latmin) )
ri   = ind[:,0]
ci   = ind[:,1]
LON = LON_G[ix_(ri,ci)]
LAT = LAT_G[ix_(ri,ci)]
LON = LON[1].reshape(5,5) #THIS STEP IS A RESULT OF LON_G,LAT_G BEING MISSHAPEN
LAT = LAT[1].reshape(5,5) #THIS STEP IS A RESULT OF LON_G,LAT_G BEING MISSHAPEN
# AVERAGE on NxN sub-grids such that
#INDEX GRID
# 
# Essentially we are averaging each sub-grid within the domain, that is, each 2x2 block of grid points
# IF the following is the domain:
#
#      (ln1,lt1)      (ln2,lt1)     (ln3,lt1)     (ln4,lt1)     (ln5,lt1)
#
#      (ln1,lt2)      (ln2,lt2)     (ln3,lt2)     (ln4,lt2)     (ln5,lt2)
#
#      (ln1,lt3)      (ln2,lt3)     (ln3,lt3)     (ln4,lt3)     (ln5,lt3)
#
#      (ln1,lt4)      (ln2,lt4)     (ln3,lt4)     (ln4,lt4)     (ln5,lt4)
#
#      (ln1,lt5)      (ln2,lt5)     (ln3,lt5)     (ln4,lt5)     (ln5,lt5)
#
# then the first sub-grid is:
#
#      (ln1,lt1)      (ln2,lt1) 
#
#      (ln1,lt2)      (ln2,lt2) 
#
# the next sub-grid is:
#
#      (ln2,lt1)     (ln3,lt1)
#
#      (ln2,lt2)     (ln3,lt2)
#
# And so on. If we associate each grid point with its data and then compute the average
# value of each sub-grid, we will have an `array', in this case of 16 mean values, i.e.:
#
#      (ln1,lt1)      (ln2,lt1)     (ln3,lt1)     (ln4,lt1)     (ln5,lt1)
#               mean1          mean2         mean3         mean4
#      (ln1,lt2)      (ln2,lt2)     (ln3,lt2)     (ln4,lt2)     (ln5,lt2)
#               mean5          mean6         mean7         mean8
#      (ln1,lt3)      (ln2,lt3)     (ln3,lt3)     (ln4,lt3)     (ln5,lt3)
#               mean9          mean10        mean11        mean12
#      (ln1,lt4)      (ln2,lt4)     (ln3,lt4)     (ln4,lt4)     (ln5,lt4)
#               mean13         mean14        mean15        mean16
#      (ln1,lt5)      (ln2,lt5)     (ln3,lt5)     (ln4,lt5)     (ln5,lt5)
#
# We then take the mean of those means to get the mean of the domain/region at each level.
# In doing the mean this way, the overlap in averaging towards the interior values gives
# more weight to those values and hence a more statistically significant mean for
# the region.
#
TROP = trop[ix_(ri,ci)]
TROP = TROP[1].reshape(5,5) #Hmmm, I FEEL LIKE I'M REALLY NOT UNDERSTANDING PYTHON INDEXING
n,m = TROP.shape
TROP_BAR = average(split(average(split(TROP, m // N, axis=1), axis=-1), n // N, axis=1), axis=-1)
print(TROP_BAR)
OMEGA_BAR = zeros(12)
for i1 in range(12): # ALL 12 LEVELS, INDICES 0..11
    oms = om[i1]
    OMS = oms[ix_(ri,ci)]
    OMS = OMS[1].reshape(5,5)
    OMEGA_BAR[i1] = average(split(average(split(OMS, m // N, axis=1), axis=-1), n // N, axis=1), axis=-1)

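The averages I get make no sense, so I would like to obtain averages that actually do. Thanks in advance.

2 Answers:

Answer 0: (score: 1)

Although I don't think it is the most efficient approach, I have found a solution that gives me the correct answer, so I am posting it here (below). However, I would like to know whether anyone has a more efficient solution than looping over the matrix. Thanks again.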

src
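
A loop-based version of the overlapping sub-grid average described in the question might look like the following. This is a minimal sketch and not the answerer's original code: the function name `subgrid_mean_loop`, its parameter `n`, and the `demo` array are all hypothetical.

```python
import numpy as np

def subgrid_mean_loop(field, n=2):
    """Mean of every overlapping n x n sub-grid of a 2D array (hypothetical helper)."""
    field = np.asarray(field, dtype=float)
    rows, cols = field.shape
    means = np.empty((rows - n + 1, cols - n + 1))
    # Visit the top-left corner of each sub-grid and average the block below it.
    for i in range(rows - n + 1):
        for j in range(cols - n + 1):
            means[i, j] = field[i:i + n, j:j + n].mean()
    return means

# Placeholder 5x5 field standing in for the selected region.
demo = np.arange(25, dtype=float).reshape(5, 5)
means = subgrid_mean_loop(demo)
print(means.shape)    # (4, 4): the 16 sub-grid means
print(means.mean())   # mean of the means for the region
```

For a 3D field, the same function could be called once per level, which is the looping cost the answer asks about.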

Answer 1: (score: 0)

Could you check the following code:

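import numpy as np

def gridavg(testin):
    # Average each overlapping 2x2 sub-grid of a 2D array.
    testin = np.array(testin)
    # Mean of vertically adjacent rows: shape (rows-1, cols).
    test_a = 0.5 * (testin[:-1, :] + testin[1:, :])
    # Mean of horizontally adjacent columns: shape (rows-1, cols-1).
    testout = 0.5 * (test_a[:, :-1] + test_a[:, 1:])
    return testout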

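Unfortunately, this only works for 2x2 sub-grids, but it should be faster, because the averaging is done with whole-array NumPy operations rather than Python loops.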
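
The same 2x2 idea can be applied to the 3D case from the question by indexing only the last two axes. The sketch below is an assumption about how that might be done, not part of the original answer: `gridavg_nd`, `om_sub`, `sub_means`, and `omega_bar` are hypothetical names, and the random array stands in for the real 12x5x5 data.

```python
import numpy as np

def gridavg_nd(data):
    """Overlapping 2x2 sub-grid means over the last two axes (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    # The ellipsis leaves any leading axes (e.g. pressure level) untouched.
    return 0.25 * (data[..., :-1, :-1] + data[..., 1:, :-1]
                   + data[..., :-1, 1:] + data[..., 1:, 1:])

# Placeholder data: 12 levels of a 5x5 region.
om_sub = np.random.default_rng(0).random((12, 5, 5))

sub_means = gridavg_nd(om_sub)               # shape (12, 4, 4)
omega_bar = sub_means.mean(axis=(-2, -1))    # one regional mean per level, shape (12,)
print(sub_means.shape, omega_bar.shape)
```

This removes the per-level loop, since all 12 levels are averaged in one pass.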

For a more general approach, you could try the following:

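def gridavg(testin, n, k):
    # Average every overlapping k x k sub-grid of an n x n array.
    # n is the side length of the (square) input, k the sub-grid size.
    if k < 1:
        raise ValueError("k must be a positive integer")
    testin = np.array(testin)
    from_end = 1 - k

    # Mean of k consecutive rows, built from k shifted row slices.
    sum_a = None
    for i0 in range(k):
        if sum_a is None:
            sum_a = np.array(testin[i0:(from_end + n), :])
        else:
            sum_a = sum_a + testin[i0:(from_end + i0 + n), :]
    sum_a = sum_a / float(k)

    # Mean of k consecutive columns of the row-averaged array.
    sum_b = None
    for j0 in range(k):
        if sum_b is None:
            sum_b = sum_a[:, j0:(from_end + n)]
        else:
            sum_b = sum_b + sum_a[:, j0:(from_end + j0 + n)]
    testout = sum_b / float(k)
    return testout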

I tried it with random matrices, and it seems to work for k values of 2 and 3.
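
An alternative for general k, which also handles non-square and 3D inputs, is NumPy's `sliding_window_view`. This is a sketch rather than part of the original answer; it assumes NumPy 1.20 or later, and `window_mean` and `field` are hypothetical names.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def window_mean(data, k):
    """Mean of every overlapping k x k window over the last two axes (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    # Result has shape (..., rows-k+1, cols-k+1, k, k); no data is copied here.
    windows = sliding_window_view(data, (k, k), axis=(-2, -1))
    # Averaging over the two trailing window axes gives one value per window.
    return windows.mean(axis=(-2, -1))

# Placeholder 5x5 field.
field = np.arange(25, dtype=float).reshape(5, 5)
print(window_mean(field, 2).shape)   # (4, 4)
print(window_mean(field, 3).shape)   # (3, 3)
```

The view itself is cheap, but the `mean` call still reads k*k values per output point, so for large k the shifted-slice approach above may use less memory bandwidth.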