Question

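I am trying to generate gridded rainfall data over the UK using a thin plate spline algorithm, and to eliminate in R any values that are not over land - a process I have so far only been able to do manually. The problem is challenging for me and even hard to explain, so I will walk through what I have done so far. Any help is very welcome.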

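First, I load a data table into R that represents the rainfall for one day from a number of point-location weather stations. Each row of the table contains the date, the ID of the station, the easting and northing of the station, the daily rainfall for the site and the average rainfall for the year. I also load the libraries fields, maptools and gstat.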

library(fields)
library(maptools)
library(gstat)

dat <- read.table("1961month1day1.csv", header=TRUE, sep=",", quote = "")
names(dat) <- c("easting", "northing", "dailyrainfall","averageyearlyrainfall")

Here is a sample of the data:

dput(head(dat, 20))
structure(list(easting = c(130000L, 145000L, 155000L, 170000L, 
180000L, 180000L, 180000L, 180000L, 185000L, 200000L, 200000L, 
205000L, 210000L, 220000L, 225000L, 230000L, 230000L, 230000L, 
230000L, 235000L), northing = c(660000L, 30000L, 735000L, 40000L, 
30000L, 45000L, 60000L, 750000L, 725000L, 50000L, 845000L, 65000L, 
770000L, 105000L, 670000L, 100000L, 620000L, 680000L, 95000L, 
120000L), dailyrainfall = c(9.4, 4.1, 12.4, 2.8, 1.3, 3.6, 4.8, 26.7, 19.8, 
4.6, 1.7, 4.1, 12.7, 1.8, 3, 5.3, 1, 1.5, 1.5, 4.6), averageyearlyrainfall = c(1334.626923, 
1123.051923, 2072.030769, 1207.584615, 928, 1089.334615, 880.0884615, 
2810.323077, 1933.719231, 1215.642308, 2644.171154, 1235.913462, 
2140.111538, 1010.436538, 1778.432692, 1116.934615, 912.2807692, 
1579.386538, 1085.498077, 1250.601923)), .Names = c("easting", 
"northing", "dailyrainfall", "averageyearlyrainfall"), row.names = c(NA, 20L), class = "data.frame")

I can then fit a thin plate spline to the data to give me a gridded surface, and plot the surface:

fit <- Tps(cbind(dat$easting,dat$northing),dat$dailyrainfall)
surface(fit)

I can then create a grid of the UK in 1 km steps using:

xvals <- seq(0, 700000, by=1000)
yvals <- seq(0, 1250000, by=1000)

and then predict the surface onto this grid and write the data to a table:

griddf <- expand.grid(xvals, yvals)
griddf$pred <- predict(fit, x=as.matrix(griddf))
write.table(griddf, file="1Jan1961grid.csv", sep=",", qmethod="double")

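Great - so far so good. I have now converted my point data to 1 km gridded data over the whole 0 to 700000 (E) and 0 to 1250000 (N) grid. The written data table is a list containing an index, an easting, a northing and the predicted rainfall value.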

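Now the challenge: I wish to remove from the list any values that are not over land. I can do this manually by loading the data into Excel (or Access) and comparing the data to another file that contains the same grid and the average yearly rainfall (this file is called 1kmgridaveragerainfall.csv). Here is a sample of this file: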

dput(head(dat1, 20))
structure(list(easting = c(-200000L, -200000L, -200000L, -200000L, 
-200000L, -200000L, -200000L, -200000L, -200000L, -200000L, -200000L, 
-200000L, -200000L, -200000L, -200000L, -200000L, -200000L, -200000L, 
-200000L, -200000L), northing = c(1245000L, 1240000L, 1235000L, 
 1230000L, 1225000L, 1220000L, 1215000L, 1210000L, 1205000L, 1200000L, 
 1195000L, 1190000L, 1185000L, 1180000L, 1175000L, 1170000L, 1165000L, 
 1160000L, 1155000L, 1150000L), averageyearlyrainfall = c(-9999, -9999, -9999, 
 -9999, -9999, -9999, -9999, -9999, -9999, -9999, -9999, -9999, 
 -9999, -9999, -9999, -9999, -9999, -9999, -9999, -9999)), .Names = c("easting", 
 "northing", "averageyearlyrainfall"), row.names = c(NA, 20L), class = "data.frame")

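Any grid square that is not over land has an average yearly rainfall of -9999. So once the two are matched (i.e. using a vlookup, or a query in Access), I can filter out the rows with this -9999 value, which leaves me with a data table that has only the eastings and northings together with the daily rainfall and the average yearly rainfall. I can then load this back into R and plot it using: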

quilt.plot(cbind(dat$easting,dat$northing),dat$mm, add.legend=TRUE, nx=654, ny=1209,xlim=c(0,700000),ylim=c(0,1200000))

This leaves me with rainfall over the UK land (and not the sea).

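So, can anyone suggest a way to achieve the same result without all the filtering in Excel or Access, i.e. using R alone? Is there a way to load both data tables into R at the start and somehow combine the TPS fit of the point data with the average data, so that grid squares equal to -9999 are not plotted?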

I know that a TPS can be weighted using a covariate (Z) - would this be of any use? I.e.

fit <- Tps(cbind(dat$easting,dat$northing),dat$dailyrainfall, Z=dat$averageyearlyrainfall)

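Also, when I run surface(fit) on the original TPS, how do I extend the surface to the edges of the plot? I am sure I have read somewhere that you add something like interp = TRUE, but this does not work.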

Any help would be much appreciated.

Thanks, Tony

Answer 1

If you have reached the point of having two data frames, you should be able to merge them into a new data frame and then filter/subset the result.

set.seed(1234) # for reproducibility

# "The written data table is a list containing an index, an easting,
# a northing and the predicted rainfall value"
# Create a simple data frame containing made-up data
mydf1 <- data.frame(index = 1:10,
                    easting = c(1, 1, 3, 4, 5, 5, 5, 5, 6, 6),
                    northing = c(12, 13, 13, 13, 14, 14, 15, 17, 18, 20),
                    predicted = runif(10, 500, 1000))

# "...comparing the data to another file that contains the same grid
# and the average yearly rainfall"
# Second data frame is similar, but has rainfall instead of predicted
mydf2 <- data.frame(index = 1:10,
                    easting = c(1, 1, 3, 4, 5, 5, 5, 5, 6, 6),
                    northing = c(12, 13, 13, 13, 14, 14, 15, 17, 18, 20),
                    rainfall = c(runif(9, 500, 1000), -9999))

# If data frames are of same size and have mostly common columns,
# merging them probably makes it easy to manipulate the data
mydf.merged <- merge(mydf1, mydf2)

# Finally, filter the merged data frame so that it only contains
# rainfall values that are not the -9999 value that denotes sea
mydf.final <- mydf.merged[mydf.merged$rainfall > -9999, ]

Here is the first data frame:

> mydf1
   index easting northing predicted
1      1       1       12  556.8517
2      2       1       13  811.1497
3      3       3       13  804.6374
4      4       4       13  811.6897
5      5       5       14  930.4577
6      6       5       14  820.1553
7      7       5       15  504.7479
8      8       5       17  616.2753
9      9       6       18  833.0419
10    10       6       20  757.1256
>

Here is the second data frame:

> mydf2
   index easting northing   rainfall
1      1       1       12   846.7956
2      2       1       13   772.4874
3      3       3       13   641.3668
4      4       4       13   961.7167
5      5       5       14   646.1579
6      6       5       14   918.6478
7      7       5       15   643.1116
8      8       5       17   633.4104
9      9       6       18   593.3614
10    10       6       20 -9999.0000
>

The merged data frame:

> mydf.merged
   index easting northing predicted   rainfall
1      1       1       12  556.8517   846.7956
2     10       6       20  757.1256 -9999.0000
3      2       1       13  811.1497   772.4874
4      3       3       13  804.6374   641.3668
5      4       4       13  811.6897   961.7167
6      5       5       14  930.4577   646.1579
7      6       5       14  820.1553   918.6478
8      7       5       15  504.7479   643.1116
9      8       5       17  616.2753   633.4104
10     9       6       18  833.0419   593.3614
>

The final data frame, with the -9999 row removed:

> mydf.final
   index easting northing predicted rainfall
1      1       1       12  556.8517 846.7956
3      2       1       13  811.1497 772.4874
4      3       3       13  804.6374 641.3668
5      4       4       13  811.6897 961.7167
6      5       5       14  930.4577 646.1579
7      6       5       14  820.1553 918.6478
8      7       5       15  504.7479 643.1116
9      8       5       17  616.2753 633.4104
10     9       6       18  833.0419 593.3614
>
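Applied to the grids in the question, the same idea might look like the sketch below. It is only an illustration under assumptions: the tiny 3 x 2 grid, the object names `griddf`, `landgrid`, `merged` and `landonly`, and all the values are made up as stand-ins for the 1 km prediction grid and for 1kmgridaveragerainfall.csv. Note that `expand.grid` names its columns Var1 and Var2 unless told otherwise, so naming them explicitly lets `merge` match on the coordinates.

```r
# Hypothetical miniature stand-in for the predicted grid
griddf <- expand.grid(easting = c(0, 1000, 2000),
                      northing = c(0, 1000))
griddf$pred <- c(1.2, 3.4, 2.2, 0.5, 4.1, 2.8)  # made-up predictions

# Hypothetical stand-in for the average-rainfall grid; -9999 marks sea
landgrid <- expand.grid(easting = c(0, 1000, 2000),
                        northing = c(0, 1000))
landgrid$averageyearlyrainfall <- c(-9999, 900, 1100, 1250, -9999, 1400)

# Match cells on their coordinates only, then drop the sea cells
merged <- merge(griddf, landgrid, by = c("easting", "northing"))
landonly <- merged[merged$averageyearlyrainfall != -9999, ]
```

Only cells whose coordinates appear in both grids survive the merge, so with the real files it is worth checking that the two grids use the same origin and spacing before relying on the result.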

Answer 2

OK, we can't reproduce your data, so here is an example with made-up data:

First make a matrix that holds the average rainfall data, with -9999 marking non-land:

> m=matrix(1:12,3,4)
> m[2,1]=-9999
> m[2,3]=-9999
> m
      [,1] [,2]  [,3] [,4]
[1,]     1    4     7   10
[2,] -9999    5 -9999   11
[3,]     3    6     9   12

Then make a matrix that is your grid of values:

> r=matrix(runif(12),3,4)
> r
          [,1]      [,2]      [,3]      [,4]
[1,] 0.9410278 0.3333299 0.5925126 0.3803659
[2,] 0.9169051 0.9797365 0.6504944 0.3154179
[3,] 0.9130946 0.7032607 0.5418443 0.8637259

Now we want to replace with NA all the values in r where m is -9999:

> r[m==-9999]=NA
> r
          [,1]      [,2]      [,3]      [,4]
[1,] 0.9410278 0.3333299 0.5925126 0.3803659
[2,]        NA 0.9797365        NA 0.3154179
[3,] 0.9130946 0.7032607 0.5418443 0.8637259

Now, if you can get your data into objects like these, the job is done, right?
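One way to get a long table, such as the `expand.grid` output in the question, into matrix form is sketched below. It assumes the rows are in `expand.grid` order (first coordinate varying fastest) and that both tables cover the same cells in the same order. The small grid and the values in `pred` and `avg` are made up for illustration.

```r
xvals <- c(0, 1000, 2000)
yvals <- c(0, 1000)
grid <- expand.grid(easting = xvals, northing = yvals)

pred <- c(1.2, 3.4, 2.2, 0.5, 4.1, 2.8)         # made-up predictions
avg  <- c(-9999, 900, 1100, 1250, -9999, 1400)  # made-up averages, -9999 = sea

# expand.grid varies its first argument fastest, so filling a matrix
# column by column gives rows = eastings and columns = northings
r <- matrix(pred, nrow = length(xvals), ncol = length(yvals))
m <- matrix(avg,  nrow = length(xvals), ncol = length(yvals))

# Blank out the sea cells
r[m == -9999] <- NA
```

A matrix laid out this way can be passed to `image(xvals, yvals, r)`, which leaves the NA cells unplotted.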

How do I generate gridded output in R and eliminate the grid squares that are not over land?
