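I am unable to exactly reproduce my results when making predictions with predict() and a saved gbm model object. I am using the raster package to predict to a raster. Each time I run predict() with the same model object and the same input (a raster stack), I get slightly different values (e.g., maximum values that differ by up to about 0.7 for predictions ranging from 0.08 to 12.30). However, the set of distinct outcomes appears to be finite: if I run the prediction enough times, I can reproduce an earlier result.
The issue seems to lie in the predict() function itself, because it does not appear to be tied to the R session, to loading the package libraries, and so on.
Is there a random number generator, or a known bug, in the predict function that might cause this behaviour?
Here is my code: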
library(raster)
library(rgdal)
library(gbm)
# read in the saved gbm model object
final <- readRDS("final_gbm_model")
# get the raster file names from the directory for the predictors
hab = list.files(getwd(), pattern="txt$", full.names=FALSE)
# import as raster stack
rstack <- stack(hab)
# run the predictions. One of my predictors is a factor character that is constant everywhere, so I use the const option:
pred <- predict(rstack, final, n.trees=final$n.trees,na.rm=TRUE,const=data.frame(WaterUse2="H")) # family="gaussian"
#Not sure if I need the family = "gaussian" option. It doesn't seem to affect the results.
# do some model specific back transformation on the predictions
e <- exp(1)
pred <- e^(pred)* 1.506951
#defining the projection for raster
albers<-CRS("+proj=aea +datum=NAD83 +lat_1=29.5 +lat_2=45.5 +lat_0=23.0 +lon_0=-96.0 +x_0=0 +y_0=0")
projection(pred) <- albers
projection(pred, asText = TRUE)
# Export prediction grid to geotiff
writeRaster(pred,filename="preds.tif",format="GTiff",overwrite=TRUE)
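One way to test the random-number-generator hypothesis is to call the same prediction several times within one session and check both how many distinct results occur and whether R's RNG state changes. The following is only a minimal sketch in base R: `check_repeatable` is a hypothetical helper written for this illustration (it is not part of raster or gbm), and it only detects use of R's own RNG, not randomness or uninitialised memory inside compiled code.

```r
# Hypothetical helper (not part of raster or gbm): call f() n times and report
# how many distinct results occur and whether R's RNG state was touched.
# Note: it calls set.seed(), so it resets the session's RNG state.
check_repeatable <- function(f, n = 5) {
  set.seed(1)  # ensures .Random.seed exists and starts from a known state
  seed_before <- get(".Random.seed", envir = globalenv())
  results <- lapply(seq_len(n), function(i) f())
  seed_after <- get(".Random.seed", envir = globalenv())
  # For each run, the index of the first run it is identical to
  first_match <- vapply(results, function(r) {
    which(vapply(results, identical, logical(1), r))[1]
  }, integer(1))
  list(
    n_distinct = length(unique(first_match)),
    rng_used   = !identical(seed_before, seed_after)
  )
}

# Toy demonstration: a deterministic function and one that draws random numbers
det  <- check_repeatable(function() cumsum(1:10))
rand <- check_repeatable(function() runif(3))
print(det)
print(rand)
```

For the case above, `f` could be something like `function() getValues(predict(rstack, final, n.trees=final$n.trees, na.rm=TRUE, const=data.frame(WaterUse2="H")))`. If `n_distinct` is greater than 1 while `rng_used` is FALSE, R's random number generator is probably not the cause.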
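When I then open the raster in ArcGIS, I notice that the summary statistics are often, though not always, slightly different from one run to the next. Here is a Dropbox link to my saved model object and raster stack: https://www.dropbox.com/s/cb5nybhb6ioc8d9/rstack_model.zip?dl=0. Both can be read with readRDS().
Here is what I get when I run gdalUtils::gdalinfo(yourfile) on rasters created in two different R sessions from the same model file and inputs: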
gdalUtils::gdalinfo("preds2_NO3_TD_Pubs.tif")
[1] "Driver: GTiff/GeoTIFF"
[2] "Files: preds2_NO3_TD_Pubs.tif"
[3] "Size is 250, 718"
[4] "Coordinate System is:"
[5] "PROJCS[\"unnamed\","
[6] " GEOGCS[\"NAD83\","
[7] " DATUM[\"North_American_Datum_1983\","
[8] " SPHEROID[\"GRS 1980\",6378137,298.2572221010042,"
[9] " AUTHORITY[\"EPSG\",\"7019\"]],"
[10] " TOWGS84[0,0,0,0,0,0,0],"
[11] " AUTHORITY[\"EPSG\",\"6269\"]],"
[12] " PRIMEM[\"Greenwich\",0],"
[13] " UNIT[\"degree\",0.0174532925199433],"
[14] " AUTHORITY[\"EPSG\",\"4269\"]],"
[15] " PROJECTION[\"Albers_Conic_Equal_Area\"],"
[16] " PARAMETER[\"standard_parallel_1\",29.5],"
[17] " PARAMETER[\"standard_parallel_2\",45.5],"
[18] " PARAMETER[\"latitude_of_center\",23],"
[19] " PARAMETER[\"longitude_of_center\",-96],"
[20] " PARAMETER[\"false_easting\",0],"
[21] " PARAMETER[\"false_northing\",0],"
[22] " UNIT[\"metre\",1,"
[23] " AUTHORITY[\"EPSG\",\"9001\"]]]"
[24] "Origin = (-2256000.000000000000000,2275000.000000000000000)"
[25] "Pixel Size = (1000.000000000000000,-1000.000000000000000)"
[26] "Metadata:"
[27] " AREA_OR_POINT=Area"
[28] "Image Structure Metadata:"
[29] " COMPRESSION=LZW"
[30] " INTERLEAVE=BAND"
[31] "Corner Coordinates:"
[32] "Upper Left (-2256000.000, 2275000.000) (123d14'22.95\"W, 40d33'27.36\"N)"
[33] "Lower Left (-2256000.000, 1557000.000) (121d 0'52.48\"W, 34d23'17.23\"N)"
[34] "Upper Right (-2006000.000, 2275000.000) (120d21'32.84\"W, 41d 9'22.70\"N)"
[35] "Lower Right (-2006000.000, 1557000.000) (118d20'58.37\"W, 34d56'21.98\"N)"
[36] "Center (-2131000.000, 1916000.000) (120d42' 6.60\"W, 37d46'25.81\"N)"
[37] "Band 1 Block=250x4 Type=Float64, ColorInterp=Gray"
[38] " Min=0.077 Max=11.509 "
[39] " Minimum=0.077, Maximum=11.509, Mean=2.150, StdDev=1.373"
[40] " NoData Value=-1.6999999999999999e+308"
[41] " Metadata:"
[42] " STATISTICS_MAXIMUM=11.508599723736"
[43] " STATISTICS_MEAN=2.1504374380146"
[44] " STATISTICS_MINIMUM=0.077388949363546"
[45] " STATISTICS_STDDEV=1.3731923433979"
And another run:
gdalUtils::gdalinfo("preds3_NO3_TD_Pubs.tif")
[1] "Driver: GTiff/GeoTIFF"
[2] "Files: preds3_NO3_TD_Pubs.tif"
[3] "Size is 250, 718"
[4] "Coordinate System is:"
[5] "PROJCS[\"unnamed\","
[6] " GEOGCS[\"NAD83\","
[7] " DATUM[\"North_American_Datum_1983\","
[8] " SPHEROID[\"GRS 1980\",6378137,298.2572221010042,"
[9] " AUTHORITY[\"EPSG\",\"7019\"]],"
[10] " TOWGS84[0,0,0,0,0,0,0],"
[11] " AUTHORITY[\"EPSG\",\"6269\"]],"
[12] " PRIMEM[\"Greenwich\",0],"
[13] " UNIT[\"degree\",0.0174532925199433],"
[14] " AUTHORITY[\"EPSG\",\"4269\"]],"
[15] " PROJECTION[\"Albers_Conic_Equal_Area\"],"
[16] " PARAMETER[\"standard_parallel_1\",29.5],"
[17] " PARAMETER[\"standard_parallel_2\",45.5],"
[18] " PARAMETER[\"latitude_of_center\",23],"
[19] " PARAMETER[\"longitude_of_center\",-96],"
[20] " PARAMETER[\"false_easting\",0],"
[21] " PARAMETER[\"false_northing\",0],"
[22] " UNIT[\"metre\",1,"
[23] " AUTHORITY[\"EPSG\",\"9001\"]]]"
[24] "Origin = (-2256000.000000000000000,2275000.000000000000000)"
[25] "Pixel Size = (1000.000000000000000,-1000.000000000000000)"
[26] "Metadata:"
[27] " AREA_OR_POINT=Area"
[28] "Image Structure Metadata:"
[29] " COMPRESSION=LZW"
[30] " INTERLEAVE=BAND"
[31] "Corner Coordinates:"
[32] "Upper Left (-2256000.000, 2275000.000) (123d14'22.95\"W, 40d33'27.36\"N)"
[33] "Lower Left (-2256000.000, 1557000.000) (121d 0'52.48\"W, 34d23'17.23\"N)"
[34] "Upper Right (-2006000.000, 2275000.000) (120d21'32.84\"W, 41d 9'22.70\"N)"
[35] "Lower Right (-2006000.000, 1557000.000) (118d20'58.37\"W, 34d56'21.98\"N)"
[36] "Center (-2131000.000, 1916000.000) (120d42' 6.60\"W, 37d46'25.81\"N)"
[37] "Band 1 Block=250x4 Type=Float64, ColorInterp=Gray"
[38] " Min=0.077 Max=11.509 "
[39] " Minimum=0.077, Maximum=11.509, Mean=2.202, StdDev=1.389"
[40] " NoData Value=-1.6999999999999999e+308"
[41] " Metadata:"
[42] " STATISTICS_MAXIMUM=11.508599723736"
[43] " STATISTICS_MEAN=2.2022304968129"
[44] " STATISTICS_MINIMUM=0.077388949363546"
[45] " STATISTICS_STDDEV=1.3888495428978"
And yet another run:
gdalUtils::gdalinfo("preds4_NO3_TD_Pubs.tif")
[1] "Driver: GTiff/GeoTIFF"
[2] "Files: preds4_NO3_TD_Pubs.tif"
[3] "Size is 250, 718"
[4] "Coordinate System is:"
[5] "PROJCS[\"unnamed\","
[6] " GEOGCS[\"NAD83\","
[7] " DATUM[\"North_American_Datum_1983\","
[8] " SPHEROID[\"GRS 1980\",6378137,298.2572221010042,"
[9] " AUTHORITY[\"EPSG\",\"7019\"]],"
[10] " TOWGS84[0,0,0,0,0,0,0],"
[11] " AUTHORITY[\"EPSG\",\"6269\"]],"
[12] " PRIMEM[\"Greenwich\",0],"
[13] " UNIT[\"degree\",0.0174532925199433],"
[14] " AUTHORITY[\"EPSG\",\"4269\"]],"
[15] " PROJECTION[\"Albers_Conic_Equal_Area\"],"
[16] " PARAMETER[\"standard_parallel_1\",29.5],"
[17] " PARAMETER[\"standard_parallel_2\",45.5],"
[18] " PARAMETER[\"latitude_of_center\",23],"
[19] " PARAMETER[\"longitude_of_center\",-96],"
[20] " PARAMETER[\"false_easting\",0],"
[21] " PARAMETER[\"false_northing\",0],"
[22] " UNIT[\"metre\",1,"
[23] " AUTHORITY[\"EPSG\",\"9001\"]]]"
[24] "Origin = (-2256000.000000000000000,2275000.000000000000000)"
[25] "Pixel Size = (1000.000000000000000,-1000.000000000000000)"
[26] "Metadata:"
[27] " AREA_OR_POINT=Area"
[28] "Image Structure Metadata:"
[29] " COMPRESSION=LZW"
[30] " INTERLEAVE=BAND"
[31] "Corner Coordinates:"
[32] "Upper Left (-2256000.000, 2275000.000) (123d14'22.95\"W, 40d33'27.36\"N)"
[33] "Lower Left (-2256000.000, 1557000.000) (121d 0'52.48\"W, 34d23'17.23\"N)"
[34] "Upper Right (-2006000.000, 2275000.000) (120d21'32.84\"W, 41d 9'22.70\"N)"
[35] "Lower Right (-2006000.000, 1557000.000) (118d20'58.37\"W, 34d56'21.98\"N)"
[36] "Center (-2131000.000, 1916000.000) (120d42' 6.60\"W, 37d46'25.81\"N)"
[37] "Band 1 Block=250x4 Type=Float64, ColorInterp=Gray"
[38] " Min=0.077 Max=11.509 "
[39] " Minimum=0.077, Maximum=11.509, Mean=2.202, StdDev=1.389"
[40] " NoData Value=-1.6999999999999999e+308"
[41] " Metadata:"
[42] " STATISTICS_MAXIMUM=11.508599723736"
[43] " STATISTICS_MEAN=2.2021866233701"
[44] " STATISTICS_MINIMUM=0.077388949363546"
[45] " STATISTICS_STDDEV=1.3888576145887"
Summing the final raster values (after the back-transformation) gives several different numbers:
ras1<- raster("preds2_NO3_TD_Pubs.tif")
ras2<- raster("preds3_NO3_TD_Pubs.tif")
ras3<- raster("preds4_NO3_TD_Pubs.tif")
sum(getValues(ras1),na.rm=TRUE)
[1] 103750
sum(getValues(ras2),na.rm=TRUE)
[1] 106248.8
sum(getValues(ras3),na.rm=TRUE)
[1] 106246.7
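Sums and summary statistics show that the runs differ, but not where. A cell-by-cell comparison of two runs can indicate whether the discrepancy comes from a different set of NA cells or from different values in the same cells. The sketch below uses base R only; `compare_values` is a hypothetical helper, and the vectors `a` and `b` are made-up toy data standing in for `getValues(ras1)` and `getValues(ras2)`.

```r
# Hypothetical helper: compare two equally long vectors of cell values.
compare_values <- function(a, b) {
  stopifnot(length(a) == length(b))
  both <- !is.na(a) & !is.na(b)      # cells with a value in both runs
  d <- abs(a[both] - b[both])
  list(
    na_mismatch  = sum(is.na(a) != is.na(b)),  # cells that are NA in only one run
    n_different  = sum(d > 0),                 # cells whose values differ
    max_abs_diff = if (length(d)) max(d) else NA_real_
  )
}

# Toy data (not taken from the rasters above)
a <- c(0.5, 1.2, NA, 3.4, 2.0)
b <- c(0.5, 1.9, NA, 3.4, NA)
res <- compare_values(a, b)
print(res)
```

With the real rasters, `which(getValues(ras1) != getValues(ras2))` would additionally give the indices of the differing cells, which could then be mapped to see whether they cluster spatially.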
For some reason I no longer get the output with a maximum of 12.445, but these are the values from one of my earlier runs:
gdalUtils::gdalinfo("preds_NO3_TD_Pubs.tif")
[1] "Driver: GTiff/GeoTIFF"
[2] "Files: preds_NO3_TD_Pubs.tif"
[3] " preds_NO3_TD_Pubs.tif.aux.xml"
[4] "Size is 250, 718"
[5] "Coordinate System is:"
[6] "PROJCS[\"unnamed\","
[7] " GEOGCS[\"NAD83\","
[8] " DATUM[\"North_American_Datum_1983\","
[9] " SPHEROID[\"GRS 1980\",6378137,298.2572221010042,"
[10] " AUTHORITY[\"EPSG\",\"7019\"]],"
[11] " TOWGS84[0,0,0,0,0,0,0],"
[12] " AUTHORITY[\"EPSG\",\"6269\"]],"
[13] " PRIMEM[\"Greenwich\",0],"
[14] " UNIT[\"degree\",0.0174532925199433],"
[15] " AUTHORITY[\"EPSG\",\"4269\"]],"
[16] " PROJECTION[\"Albers_Conic_Equal_Area\"],"
[17] " PARAMETER[\"standard_parallel_1\",29.5],"
[18] " PARAMETER[\"standard_parallel_2\",45.5],"
[19] " PARAMETER[\"latitude_of_center\",23],"
[20] " PARAMETER[\"longitude_of_center\",-96],"
[21] " PARAMETER[\"false_easting\",0],"
[22] " PARAMETER[\"false_northing\",0],"
[23] " UNIT[\"metre\",1,"
[24] " AUTHORITY[\"EPSG\",\"9001\"]]]"
[25] "Origin = (-2256000.000000000000000,2275000.000000000000000)"
[26] "Pixel Size = (1000.000000000000000,-1000.000000000000000)"
[27] "Metadata:"
[28] " AREA_OR_POINT=Area"
[29] "Image Structure Metadata:"
[30] " COMPRESSION=LZW"
[31] " INTERLEAVE=BAND"
[32] "Corner Coordinates:"
[33] "Upper Left (-2256000.000, 2275000.000) (123d14'22.95\"W, 40d33'27.36\"N)"
[34] "Lower Left (-2256000.000, 1557000.000) (121d 0'52.48\"W, 34d23'17.23\"N)"
[35] "Upper Right (-2006000.000, 2275000.000) (120d21'32.84\"W, 41d 9'22.70\"N)"
[36] "Lower Right (-2006000.000, 1557000.000) (118d20'58.37\"W, 34d56'21.98\"N)"
[37] "Center (-2131000.000, 1916000.000) (120d42' 6.60\"W, 37d46'25.81\"N)"
[38] "Band 1 Block=250x4 Type=Float64, ColorInterp=Gray"
[39] " Min=0.077 Max=12.445 "
[40] " Minimum=0.077, Maximum=12.445, Mean=2.253, StdDev=1.420"
[41] " NoData Value=-1.6999999999999999e+308"
[42] " Metadata:"
[43] " STATISTICS_MAXIMUM=12.445399168904"
[44] " STATISTICS_MEAN=2.2528155395026"
[45] " STATISTICS_MINIMUM=0.077388949363546"
[46] " STATISTICS_STDDEV=1.4199528164987"
And here is the sum of its values:
sum(getValues(raster("preds_NO3_TD_Pubs.tif")),na.rm=TRUE)
[1] 108689.3
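A quick consistency check on the numbers reported above: dividing each sum by the corresponding STATISTICS_MEAN estimates the number of non-NA cells in each output. This is a rough sketch that assumes the GDAL mean and the R sum were computed over the same non-NA cells, and it uses the sums as printed (rounded to 7 significant digits).

```r
# Sums and means copied from the output above, keyed by file prefix
sums  <- c(preds = 108689.3, preds2 = 103750,
           preds3 = 106248.8, preds4 = 106246.7)
means <- c(preds = 2.2528155395026, preds2 = 2.1504374380146,
           preds3 = 2.2022304968129, preds4 = 2.2021866233701)

# Estimated number of non-NA cells per raster (sum / mean)
n_cells <- round(sums / means)
print(n_cells)

# Total number of cells in a 250 x 718 raster, for comparison
total_cells <- 250 * 718
print(total_cells)
```

Under that assumption, every run has the same estimated count of non-NA cells, which would suggest that the differences come from changed values in some cells rather than from a changing number of NA cells.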