Question

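I have a data set with multiple sites, each of which was sampled over multiple years. As part of this, I have climate data for each sampling date as well as calculated means for several variables (mean annual temperature, mean annual precipitation, mean annual snow depth, etc.). Here is what the data frame actually looks like: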

site  date    year  temp  precip  mean.ann.temp  mn.ann.precip
a     5/1/10  2010  15    0       6              .03
a     6/2/10  2010  18    1       6              .03
a     7/3/10  2010  22    0       6              .03
b     5/2/10  2010  16    2       7              .04
b     6/3/10  2010  17    3       7              .04
b     7/4/10  2010  20    0       7              .04
c     5/3/10  2010  14    0       5              .06
c     6/4/10  2010  13    0       5              .06
c     7/8/10  2010  25    0       5              .06
d     5/5/10  2010  16    15      10             .2
d     6/6/10  2010  22    0       10             .2
d     7/7/10  2010  24    0       10             .2
...

It then continues in the same way for multiple years.

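How can I extract mean.ann.temp and mn.ann.precip for each site and year? I tried a double tapply() without success, and a double for loop, but I can't seem to figure it out. Can anyone help me? Or do I have to do it the long, tedious way by subsetting everything?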

Thanks,
Paul

Answer 1

Subset the columns and wrap the result in unique():

unique(d[,c("site","year","mean.ann.temp","mn.ann.precip")])

If the last two columns differ within a site/year and you want the first row of each, a similar approach works:

d[!duplicated(d[,c("site","year")]),]
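
To try the two one-liners above without the full data set, here is a minimal sketch that rebuilds sites a and b from the question's sample. The object names annual and first_rows are illustrative only and do not come from the original post.

```r
# Toy data frame rebuilt from the first six rows of the question's sample
d <- data.frame(
  site = rep(c("a", "b"), each = 3),
  date = c("5/1/10", "6/2/10", "7/3/10", "5/2/10", "6/3/10", "7/4/10"),
  year = 2010,
  temp = c(15, 18, 22, 16, 17, 20),
  precip = c(0, 1, 0, 2, 3, 0),
  mean.ann.temp = rep(c(6, 7), each = 3),
  mn.ann.precip = rep(c(0.03, 0.04), each = 3)
)

# The annual columns repeat within each site/year, so unique() collapses
# them to one row per group
annual <- unique(d[, c("site", "year", "mean.ann.temp", "mn.ann.precip")])

# duplicated() flags repeats of the site/year key; negating it keeps the
# first row of each group, with all original columns
first_rows <- d[!duplicated(d[, c("site", "year")]), ]
```

Note that unique() only yields one row per site and year when the annual values are truly constant within each group; the duplicated() form does not depend on that.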

Answer 2

Use plyr to compute the summaries:

library(plyr)
# One row per site/year; yourDF stands for your data frame
ddply(yourDF, .(site, year), summarize,
  meanTemp = mean(mean.ann.temp),
  meanPrec = mean(mn.ann.precip)
)
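
If you would rather avoid an extra package, the same grouping can probably be done with base R's aggregate(). This is a sketch of an alternative, not part of the plyr answer; the toy data frame d and the result name annual are assumptions made for illustration.

```r
# Toy data frame with the relevant columns for sites a and b
d <- data.frame(
  site = rep(c("a", "b"), each = 3),
  year = 2010,
  mean.ann.temp = rep(c(6, 7), each = 3),
  mn.ann.precip = rep(c(0.03, 0.04), each = 3)
)

# Average both annual columns within each site/year combination
annual <- aggregate(
  cbind(mean.ann.temp, mn.ann.precip) ~ site + year,
  data = d,
  FUN = mean
)
```

The result keeps the original column names (mean.ann.temp, mn.ann.precip) rather than the meanTemp and meanPrec names used in the plyr call.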

Getting means within site and year in R
