Question

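I have a classified raster that I am reading into a numpy array (n classes).

I want to use a 2D moving window (e.g. 3 by 3) to create an n-dimensional vector that stores the % cover of each class within the window. Because the raster is large, it would be useful to store this information so that it does not have to be recalculated each time, so I think the best solution is to create a 3D array to act as the vector. New rasters will be created from these %/count values.

My idea is:

1) Create a 3D array with n + 1 'bands'

2) Band 1 = the original classified raster. Each other 'band' = the count of cells of a given value within the window (i.e. one band per class). For example: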

[[2 0 1 2 1]
 [2 0 2 0 0]
 [0 1 1 2 1]
 [0 2 2 1 1]
 [0 1 2 1 1]]
[[2 2 3 2 2]
 [3 3 3 2 2]
 [3 3 2 2 2]
 [3 3 0 0 0]
 [2 2 0 0 0]]
[[0 1 1 2 1]
 [1 3 3 4 2]
 [1 2 3 4 3]
 [2 3 5 6 5]
 [1 1 3 4 4]]
[[2 3 2 2 1]
 [2 3 3 3 2]
 [2 4 4 3 1]
 [1 3 5 3 1]
 [1 3 3 2 0]]

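4) Read these bands into a VRT, so that it only needs to be created once and can be read into further modules.

Question: what is the most efficient 'moving window' method for 'counting' within the window?

Currently I am trying, and failing, with the following code: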

def lcc_binary_vrt(raster, dim, bands):
    footprint = np.zeros(shape = (dim,dim), dtype = int)+1
    g = gdal.Open(raster)
    data = gdal_array.DatasetReadAsArray(g)

    #loop through the band values
    for i in bands:   
        print i
        # create a duplicate '0' array of the raster
        a_band = data*0
        # we create the binary dataset for the band        
        a_band = np.where(data == i, 1, a_band)
        count_a_band_fname = raster[:-4] + '_' + str(i) + '.tif'        
        # run the moving window (footprint) accross the band to create a 'count'
        count_a_band = ndimage.generic_filter(a_band, np.count_nonzero(x), footprint=footprint, mode = 'constant')
        geoTiff.create(count_a_band_fname, g, data, count_a_band, gdal.GDT_Byte, np.nan)

Any suggestions would be greatly appreciated.

Becky

Answer 1

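I know nothing about spatial sciences, so I'll just focus on the main question :)

What is the most efficient 'moving window' method for 'counting' within the window?

A common way to compute moving window statistics with Numpy is to use numpy.lib.stride_tricks.as_strided, see for example this answer. Basically, the idea is to create an array containing all the windows without any increase in memory usage: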

from numpy.lib.stride_tricks import as_strided

...

m, n = a_band.shape
newshape = (m-dim+1, n-dim+1, dim, dim)
newstrides = a_band.strides * 2  # strides is a tuple
count_a_band = as_strided(a_band, newshape, newstrides).sum(axis=(2,3))
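As a hedged illustration of the same windowing idea, the sketch below uses numpy.lib.stride_tricks.sliding_window_view (available in NumPy 1.20 and later) in place of a hand-written as_strided call, since it checks shapes and strides for you. The example raster is the one from the question; the zero padding with np.pad is an assumption added here so that the output has the same shape as the input.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Example classified raster from the question (classes 0, 1, 2).
data = np.array([[2, 0, 1, 2, 1],
                 [2, 0, 2, 0, 0],
                 [0, 1, 1, 2, 1],
                 [0, 2, 2, 1, 1],
                 [0, 1, 2, 1, 1]])
dim = 3

a_band = (data == 0).astype(np.uint8)  # binary mask for class 0

# Windows that fit entirely inside the raster: shape (m-dim+1, n-dim+1, dim, dim).
windows = sliding_window_view(a_band, (dim, dim))
count_valid = windows.sum(axis=(2, 3))

# Assumption: pad with zeros so every cell gets a window (same shape as input).
padded = np.pad(a_band, dim // 2, mode="constant")
count_same = sliding_window_view(padded, (dim, dim)).sum(axis=(2, 3))
```

Here count_same reproduces the class 0 band shown in the question, and count_valid is its interior without the edge cells.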

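However, for your use case this method is inefficient, because you sum the same numbers over and over, especially as the window size increases. A better approach is to use the cumsum trick, as in this answer: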

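import numpy as np

def windowed_sum_1d(ar, ws, axis=None):
    """Sum over a sliding window of length ws along one axis (no padding)."""
    if axis is None:
        ar = ar.ravel()
    else:
        # Move the target axis to the front so we can slice along axis 0
        ar = np.swapaxes(ar, axis, 0)

    ans = np.cumsum(ar, axis=0)
    # Window sum = cumsum at the window end minus cumsum just before its start
    ans[ws:] = ans[ws:] - ans[:-ws]

    # Drop the first ws-1 entries, which cover incomplete windows
    ans = ans[ws-1:]

    if axis is not None:
        ans = np.swapaxes(ans, 0, axis)

    return ans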


def windowed_sum(ar, ws):
    for axis in range(ar.ndim):
        ar = windowed_sum_1d(ar, ws, axis)
    return ar

...

count_a_band = windowed_sum(a_band, dim)
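For reference, the same cumsum idea can be written directly in 2D as a summed-area table. The sketch below is not the code above but a variant, and it assumes an odd window size and zero padding outside the raster so that the result keeps the input shape. The function name window_count_same is made up for this illustration.

```python
import numpy as np

def window_count_same(mask, dim):
    """Count nonzero cells in the dim x dim window centred on each cell.

    Assumes dim is odd and treats cells outside the raster as 0.
    """
    pad = dim // 2
    padded = np.pad(mask.astype(np.int64), pad, mode="constant")
    # Summed-area table with an extra leading row and column of zeros.
    sat = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1), dtype=np.int64)
    sat[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    # Each window sum is a four-corner difference in the table.
    return sat[dim:, dim:] - sat[:-dim, dim:] - sat[dim:, :-dim] + sat[:-dim, :-dim]

# Example raster from the question.
data = np.array([[2, 0, 1, 2, 1],
                 [2, 0, 2, 0, 0],
                 [0, 1, 1, 2, 1],
                 [0, 2, 2, 1, 1],
                 [0, 1, 2, 1, 1]])
count_class0 = window_count_same(data == 0, 3)
```

The cost per cell is constant regardless of the window size, which is the point of the cumsum trick.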

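Note that in both pieces of code above, handling the edge cases would be tedious. Fortunately, there is an easy way to include them and get the same efficiency as the second piece of code: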

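# Cast to float first: uniform_filter returns the input dtype by default,
# so an integer mask would truncate the fractional window means.
count_a_band = ndimage.uniform_filter(a_band.astype(float), size=dim, mode='constant') * dim**2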

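Although this is very similar to what you already have, it will be much faster! The drawback is that you may need to round to integers to get rid of floating point rounding errors.

One last point: your code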
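A minimal sketch of the rounding step, using the example raster from the question and class 1. The mask is cast to float before filtering because scipy.ndimage.uniform_filter returns an array of the input dtype by default, and np.rint is one reasonable way to remove the floating point error before converting back to integers.

```python
import numpy as np
from scipy import ndimage

# Example raster from the question.
data = np.array([[2, 0, 1, 2, 1],
                 [2, 0, 2, 0, 0],
                 [0, 1, 1, 2, 1],
                 [0, 2, 2, 1, 1],
                 [0, 1, 2, 1, 1]])
dim = 3

a_band = (data == 1).astype(float)  # binary mask for class 1, as float

# Window mean with zeros outside the raster, scaled back to a count.
mean = ndimage.uniform_filter(a_band, size=dim, mode="constant", cval=0.0)
count_a_band = np.rint(mean * dim**2).astype(np.uint8)
```

The result matches the class 1 band listed in the question.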

# create a duplicate '0' array of the raster
a_band = data*0
# we create the binary dataset for the band        
a_band = np.where(data == i, 1, a_band)

is a bit redundant: you can use a_band = (data == i).
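Putting the pieces together, the n + 1 band array described in the question could be built roughly as follows. This is only a sketch: the function name class_cover_stack, the float32 dtype, and the choice to store percentages rather than counts are assumptions, and writing the result to a GeoTIFF or VRT is left out.

```python
import numpy as np
from scipy import ndimage

def class_cover_stack(data, classes, dim):
    """Return an array of shape (len(classes) + 1, rows, cols).

    Band 0 is the original raster; each following band holds the percent
    cover of one class in the dim x dim window around every cell.
    """
    bands = [data.astype(np.float32)]
    for c in classes:
        mask = (data == c).astype(np.float32)
        # Fraction of window cells belonging to class c (zeros outside the raster).
        frac = ndimage.uniform_filter(mask, size=dim, mode="constant", cval=0.0)
        bands.append(100.0 * frac)
    return np.stack(bands)

# Example raster from the question.
data = np.array([[2, 0, 1, 2, 1],
                 [2, 0, 2, 0, 0],
                 [0, 1, 1, 2, 1],
                 [0, 2, 2, 1, 1],
                 [0, 1, 2, 1, 1]])
stack = class_cover_stack(data, classes=[0, 1, 2], dim=3)
```

With mode='constant' the percentages are relative to the full window, so they add up to 100 only for cells whose window lies entirely inside the raster; edge cells sum to less.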
