How do I fill an empty matrix in R from the results of a for loop?

Time: 2017-04-07 01:29:52

Tags: r

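I am calculating the yearly mean NDVI value for each of 49681 sites over 25 years. I wrote a for loop, but I can't figure out how to fill an empty 49681 x 25 matrix with the results. My code currently fills only the first column of the matrix. Any suggestions on how to fix this?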

A sample of my data

yearly.avg <- matrix (nrow=49681, ncol=25)
for (i in 1:49681) {
yearly.avg[i] <- mean(as.numeric(veg.data[i, 4:603]))
yearly.avg[i] <- mean(as.numeric(veg.data[i,4:27]))
yearly.avg[i] <- mean(as.numeric(veg.data[i,28:51]))
yearly.avg[i] <- mean(as.numeric(veg.data[i,52:75]))
yearly.avg[i] <- mean(as.numeric(veg.data[i,76:99]))
yearly.avg[i] <- mean(as.numeric(veg.data[i,100:123]))
yearly.avg[i] <- mean(as.numeric(veg.data[i,124:147]))
yearly.avg[i] <- mean(as.numeric(veg.data[i,148:171]))
yearly.avg[i] <- mean(as.numeric(veg.data[i,172:195]))
yearly.avg[i] <- mean(as.numeric(veg.data[i,196:219]))
yearly.avg[i] <- mean(as.numeric(veg.data[i,220:243]))
yearly.avg[i] <- mean(as.numeric(veg.data[i,244:267]))
yearly.avg[i] <- mean(as.numeric(veg.data[i,268:291]))
yearly.avg[i] <- mean(as.numeric(veg.data[i,292:315]))
yearly.avg[i] <- mean(as.numeric(veg.data[i,316:339]))
yearly.avg[i] <- mean(as.numeric(veg.data[i,340:363]))
yearly.avg[i] <- mean(as.numeric(veg.data[i,364:387]))
yearly.avg[i] <- mean(as.numeric(veg.data[i,388:411]))
yearly.avg[i] <- mean(as.numeric(veg.data[i,412:435]))
yearly.avg[i] <- mean(as.numeric(veg.data[i,436:459]))
yearly.avg[i] <- mean(as.numeric(veg.data[i,460:483]))
yearly.avg[i] <- mean(as.numeric(veg.data[i,484:507]))
yearly.avg[i] <- mean(as.numeric(veg.data[i,508:531]))
yearly.avg[i] <- mean(as.numeric(veg.data[i,532:555]))
yearly.avg[i] <- mean(as.numeric(veg.data[i,556:579]))
yearly.avg[i] <- mean(as.numeric(veg.data[i,580:603]))
}
head(yearly.avg)

2 Answers:

Answer 0: (score: 1)

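There are better ways to do this than a for loop, but for starters, you are trying to assign 26 sets of values to 25 columns. You are also effectively telling R to fill a single column, one row i at a time, 26 times over, with 25 of those assignments overwriting the value written just before. I'm also unsure about the ranges you use: each one, such as 4:27, spans 24 columns. Setting all of that aside and answering your question as asked, you would write the for loop like this: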

for(i in 1:49681){
    yearly.avg[i,1] <- mean(as.numeric(veg.data[i,4:603]))
    yearly.avg[i,2] <- mean(as.numeric(veg.data[i,4:27]))
    ...
}

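That said, I can assure you there is a better way to accomplish what you are trying to do. We would need more information about your data set and the result format you want in order to suggest the best approach.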
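To make the indexing issue concrete, here is a minimal sketch on a small made-up data set. It assumes, as in the question, that the first three columns are identifiers and that each year occupies 24 consecutive columns starting at column 4. The names `n.sites`, `n.years`, `per.year`, `toy` and `m` are illustrative only and do not come from the original post.

```r
# Toy stand-in for veg.data: 5 sites, 3 id columns, 2 years x 24 observations.
# All sizes and names here are assumptions for illustration.
set.seed(1)
n.sites  <- 5
n.years  <- 2
per.year <- 24
toy <- data.frame(id = 1:n.sites, x = 0, y = 0,
                  matrix(runif(n.sites * n.years * per.year),
                         nrow = n.sites))

# m[i] uses linear (column-major) indexing, so for i <= nrow(m) it only
# ever touches the first column; m[i, j] addresses row i, column j.
m <- matrix(NA_real_, nrow = 3, ncol = 2)
m[2] <- 1        # same cell as m[2, 1]
m[2, 2] <- 2     # second column needs an explicit column index

# Fill one cell per site (row) and year (column)
yearly.avg <- matrix(NA_real_, nrow = n.sites, ncol = n.years)
for (i in seq_len(n.sites)) {
  for (j in seq_len(n.years)) {
    first <- 4 + (j - 1) * per.year          # first column of year j
    cols  <- first:(first + per.year - 1)    # 4:27, 28:51, ...
    yearly.avg[i, j] <- mean(as.numeric(toy[i, cols]))
  }
}
yearly.avg
```

Computing the column range from `j` avoids writing one assignment per year by hand, but on the full 49681 x 25 problem a double loop like this is likely to be slow compared with a vectorised approach.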

Answer 1: (score: 1)

Based on this answer, you can do the following:

# make some fake data and set up the structure
siteCount <- 49681
yearCount <- 25
monthCount <- 12
obsPerMonth <- 2
obsCount <- yearCount * monthCount * obsPerMonth
fakeData <- sample(100:500, size=siteCount * obsCount, replace=TRUE)
veg.data <- matrix(fakeData, nrow=siteCount, ncol=obsCount)

sitenum <- sprintf("N%05d", (1:49681)+9000)
lat <- seq(from=40.90952, by=0.08, length.out=length(sitenum))
long <- seq(from=2.755276, by=0.08, length.out=length(sitenum))

veg.data <- as.data.frame(cbind(sitenum, lat, long, veg.data), stringsAsFactors=FALSE)
v <- expand.grid(c('A', 'B'), sprintf("%02d", 1:monthCount), 1982:2006)
dataColNames <- paste('Y', v[, 3], '.', v[, 2], v[, 1], sep='')

colnames(veg.data) <- c('sitenum', 'x', 'y', dataColNames)

###
# we now have the sample data, we can calculate yearly means
###

# First, get a numeric matrix of just the veg data
veg.data2 <- as.matrix(veg.data[, 4:ncol(veg.data)])
storage.mode(veg.data2) <- "numeric"

# shorten the column headings to the year prefix (e.g. "Y1982"), so that we can average by year
colnames(veg.data2) <- substring(colnames(veg.data2), 1, 5)

# now, calculate yearly averages
yearly.avg <- sapply(unique(colnames(veg.data2)), function(x) 
      rowMeans(veg.data2[,colnames(veg.data2)== x,drop=FALSE], na.rm=TRUE))

# have a look
head(yearly.avg)
        Y1982    Y1983    Y1984    Y1985    Y1986    Y1987    Y1988    Y1989    Y1990    Y1991    Y1992    Y1993
[1,] 325.2083 363.7500 283.6250 315.6667 289.7500 260.7917 297.0000 301.5833 285.9167 299.2083 264.9167 311.2083
[2,] 307.6250 287.7500 281.3750 296.5833 330.7083 268.2917 331.5417 309.6667 275.7917 300.5833 287.9583 291.2500
[3,] 272.5000 295.9167 302.1250 314.7083 270.6667 340.2917 287.1250 336.3333 309.2500 266.7500 273.5000 254.2917
[4,] 288.9167 280.7083 299.1667 279.5833 301.4583 283.7917 274.6667 295.6250 238.6250 324.7917 302.2083 283.1667
[5,] 280.8750 282.5833 294.7083 276.0417 303.2917 266.5000 324.9583 301.5417 266.2917 327.0417 295.7083 262.7917
[6,] 275.0833 321.5833 305.1250 308.5417 266.7917 304.2083 304.1250 290.1667 312.9167 266.5000 273.7500 314.2917
        Y1994    Y1995    Y1996    Y1997    Y1998    Y1999    Y2000    Y2001    Y2002    Y2003    Y2004    Y2005
[1,] 320.7083 339.9583 288.9167 329.3750 303.6667 290.0417 288.3333 299.0417 290.3333 315.2500 272.5833 303.1667
[2,] 336.7500 295.0000 301.7917 303.0000 294.7917 337.5417 328.1250 284.5417 301.3333 300.6667 302.7083 288.7917
[3,] 314.2083 313.7500 325.0417 290.2917 276.6250 262.7500 315.7500 267.9167 301.8750 312.3333 288.1667 308.5000
[4,] 283.1667 278.8750 300.3333 278.3333 291.7500 358.2500 326.5833 311.7500 248.8750 250.8333 316.5000 324.0417
[5,] 286.9167 290.7500 331.7500 330.2500 317.5417 326.0417 297.8750 307.4583 371.9583 323.9583 320.5833 290.3750
[6,] 290.5000 306.0833 238.0833 304.7083 300.0417 252.3333 261.1250 253.9167 274.2083 282.8750 326.8750 306.1250
        Y2006
[1,] 308.0000
[2,] 298.4167
[3,] 293.2083
[4,] 308.0417
[5,] 305.6250
[6,] 297.1667

# Manually calculate average for 1982 to check result
d <- as.matrix(veg.data[, 4:27])
storage.mode(d) <- "numeric"
head(rowMeans(d))
[1] 325.2083 307.6250 272.5000 288.9167 280.8750 275.0833

head(rowMeans(d)) == head(yearly.avg[, 'Y1982'])
[1] TRUE TRUE TRUE TRUE TRUE TRUE
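
If the column names did not encode the year, the same grouping could be driven by column position instead. The sketch below is a variant under stated assumptions, not the answerer's code: it uses a small random matrix named `vals` (hypothetical), assumes 24 consecutive columns per year, and compares results with `all.equal()`, which tolerates floating-point rounding, rather than `==`.

```r
# Small made-up data: 4 sites, 3 years, 24 observations per year (assumed layout)
set.seed(42)
n.sites  <- 4
n.years  <- 3
per.year <- 24
vals <- matrix(runif(n.sites * n.years * per.year), nrow = n.sites)

# One group label per column: 1 repeated 24 times, then 2, then 3
year.id <- rep(seq_len(n.years), each = per.year)

# Average the columns belonging to each year, row by row;
# sapply binds the per-year vectors into an n.sites x n.years matrix
yearly.avg <- sapply(seq_len(n.years), function(k)
  rowMeans(vals[, year.id == k, drop = FALSE], na.rm = TRUE))

# Check year 2 against a direct calculation over columns 25 to 48
check <- rowMeans(vals[, 25:48])
all.equal(yearly.avg[, 2], check)
```

With real data, `year.id` would have to be built from whatever actually identifies the year of each column; the fixed 24-column blocks here are only an assumption carried over from the question.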