Does ComputeBandStats take nodata into account?

Time: 2016-06-16 11:00:09

Tags: python python-2.7 python-3.x numpy gdal

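I am trying to compute statistics for an image that is only partly covered by data. I would like to know whether ComputeBandStats ignores pixels whose value equals the file's nodata value.

Here is my code: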

from osgeo import gdal

# infile is the path to the input raster
inIMG = gdal.Open(infile)

# Get stats for the first 3 bands.
# Using ComputeBandStats instead of the stats array that has min, max, mean and sd values
print("Computing band statistics")
bandas = [inIMG.GetRasterBand(b+1) for b in range(3)]
minMax = [b.ComputeRasterMinMax() for b in bandas]
meanSD = [b.ComputeBandStats(1) for b in bandas]
print(minMax)
print(meanSD)

For an image without a nodata attribute, the output is:

Computing band statistics
[(0.0, 26046.0), (0.0, 24439.0), (0.0, 22856.0)]
[(762.9534697777777, 647.9056493556284), (767.642869, 516.0531530834181), (818.0449643333334, 511.5360132592902)]

For an image with nodata = 0, the output is:

Computing band statistics
[(121.0, 26046.0), (202.0, 24439.0), (79.0, 22856.0)]
[(762.9534697777777, 647.9056493556284), (767.642869, 516.0531530834181), (818.0449643333334, 511.5360132592902)]

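The minimum and maximum have changed so that 0 is no longer the minimum. That makes sense: in the second version 0 is the nodata value, so ComputeRasterMinMax() does not take it into account. The mean and standard deviation, however, have not changed.

Does this mean that ComputeBandStats does not ignore nodata values?
Is there a way to force ComputeBandStats to ignore nodata values?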

1 Answer:

Answer 0 (score: 1)

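Setting the NoData value has no effect on the data itself; it only changes which pixels some methods treat as valid. You can try it like this: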

import numpy
from osgeo import gdal

# First image, all valid data
data = numpy.random.randint(1,10,(10,10))
driver = gdal.GetDriverByName('GTiff')
ds = driver.Create("stats1.tif", 10, 10, 1, gdal.GDT_Byte)
ds.GetRasterBand(1).WriteArray(data)
print(ds.GetRasterBand(1).ComputeBandStats(1))      # (mean, sd)
print(ds.GetRasterBand(1).ComputeStatistics(False)) # [min, max, mean, sd]
ds = None  # close the dataset and flush it to disk

# Second image, values of "1" set to no data
driver = gdal.GetDriverByName('GTiff')
ds = driver.Create("stats2.tif", 10, 10, 1, gdal.GDT_Byte)
ds.GetRasterBand(1).SetNoDataValue(1)
ds.GetRasterBand(1).WriteArray(data)
print(ds.GetRasterBand(1).ComputeBandStats(1))
print(ds.GetRasterBand(1).ComputeStatistics(False))
ds = None

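Note that the statistics returned by ComputeBandStats are the same for both images, whereas those returned by ComputeStatistics change once the NoData value is set: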

>>> (4.97, 2.451346568725035)
>>> [1.0, 9.0, 4.970000000000001, 2.4513465687250346]

>>> (4.97, 2.451346568725035)
>>> [2.0, 9.0, 5.411111111111111, 2.1750833672117]
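Going by the output above, one way to get a nodata-aware mean and standard deviation in the question's code would be to take them from ComputeStatistics rather than ComputeBandStats. The following is only a minimal sketch: the helper name mean_sd_ignoring_nodata is made up for illustration, and band is assumed to be a GDAL raster band such as the ones in the question's bandas list.

```python
def mean_sd_ignoring_nodata(band):
    """Return (mean, sd) for a band using ComputeStatistics.

    `band` is assumed to be a GDAL raster band; the helper name is
    hypothetical and not part of GDAL.
    """
    # ComputeStatistics(False) requests exact (not approximate) statistics
    # and returns them as [min, max, mean, sd].
    _minimum, _maximum, mean, sd = band.ComputeStatistics(False)
    return mean, sd
```

In the question's script this would replace the ComputeBandStats line, for example meanSD = [mean_sd_ignoring_nodata(b) for b in bandas].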

You can confirm by hand that the statistics are correct:

numpy.mean(data)
numpy.mean(data[data != 1])
numpy.std(data)
numpy.std(data[data != 1])

>>> 4.9699999999999998
>>> 5.4111111111111114
>>> 2.4513465687250346
>>> 2.1750833672117
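The same manual check can be wrapped in a small function that masks out the nodata value before computing statistics. This is a sketch under stated assumptions, not a GDAL API: the name stats_excluding is hypothetical, and it uses the population standard deviation (numpy's default), which is what matches the ComputeStatistics output shown above.

```python
import numpy

def stats_excluding(data, nodata=None):
    """Return (min, max, mean, sd) of `data`, skipping pixels equal to `nodata`.

    Hypothetical helper for illustration; sd is the population standard
    deviation (ddof=0), the default of numpy.std.
    """
    arr = numpy.asarray(data, dtype=float)
    # Keep every pixel when no nodata value is defined
    valid = arr.ravel() if nodata is None else arr[arr != nodata]
    if valid.size == 0:
        raise ValueError("no valid pixels left after masking nodata")
    return (float(valid.min()), float(valid.max()),
            float(valid.mean()), float(valid.std()))
```

For a GDAL band, data could come from band.ReadAsArray() and nodata from band.GetNoDataValue(), which returns None when no nodata value is set.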