do.call and multiple function(x)

Time: 2013-12-11 13:31:13

Tags: r

I am currently trying to compute a species abundance index, and I am a bit confused by the do.call command.

I have a data frame (DF) like this:

  YEAR     RN  DATE      NOM               SITE LONG                                      SP SUMNB     NB100
1 2011 RNN027 15056 ESTAGNOL RNN027-Estagnol 02  310 Anthocharis cardamines (Linnaeus, 1758)     1 0.3225806
2 2011 RNN027 15075 ESTAGNOL RNN027-Estagnol 02  310 Anthocharis cardamines (Linnaeus, 1758)     1 0.3225806
3 2003 RNN027 12166 ESTAGNOL RNN027-Estagnol 03  330 Anthocharis cardamines (Linnaeus, 1758)     2 0.6060606
4 2006 RNN027 13252 ESTAGNOL RNN027-Estagnol 03  330 Anthocharis cardamines (Linnaeus, 1758)     2 0.6060606
5 2006 RNN027 13257 ESTAGNOL RNN027-Estagnol 03  330 Anthocharis cardamines (Linnaeus, 1758)     2 0.6060606
6 2005 RNN027 12895 ESTAGNOL RNN027-Estagnol 01  540 Anthocharis cardamines (Linnaeus, 1758)     2 0.3703704
7 2005 RNN027 12910 ESTAGNOL RNN027-Estagnol 01  540 Anthocharis cardamines (Linnaeus, 1758)     2 0.3703704

To compute my index, I have to isolate each SITE / YEAR combination and record its first and last dates (minus 7 days and plus 7 days, respectively).
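As a minimal sketch of this step, assuming DATE is stored as a number of days (as in the sample above), the padded first and last dates per SITE / YEAR could be computed with base R's aggregate(). The data frame `toy` and the column names START and END are made up for illustration.

```r
# Toy data (hypothetical values, same column names as DF)
toy <- data.frame(
  YEAR = c(2005, 2005, 2006, 2006, 2006),
  SITE = c("A", "A", "A", "A", "A"),
  DATE = c(12895, 12910, 13252, 13257, 13300)
)

# First and last observation date of each SITE / YEAR combination
first <- aggregate(DATE ~ YEAR + SITE, data = toy, FUN = min)
last  <- aggregate(DATE ~ YEAR + SITE, data = toy, FUN = max)

# Combine them and pad by 7 days on each side
limits <- merge(first, last, by = c("YEAR", "SITE"),
                suffixes = c(".first", ".last"))
limits$START <- limits$DATE.first - 7
limits$END   <- limits$DATE.last + 7
limits
```

This only produces the limits; it does not yet attach them to the individual species.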

I should be able to do this with the following command (it is unfinished, so it does not do the job yet):

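do.call(rbind, by(DF, DF[c("YEAR","SITE")], FUN = function(x) {
  # First and last rows of the SITE / YEAR subset
  tmp <- x[c(1, nrow(x)), ]
  tmpmin <- min(tmp$DATE)
  tmpmax <- max(tmp$DATE)
  # Shifted limits (computed here, but not yet written back into tmp)
  tmp1 <- tmpmin - 7
  tmp2 <- tmpmax + 7
  return(tmp)
}))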

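But I don't know how to finish the command so that it does what I want: I need to keep the modified dates and add them before and after each SITE / YEAR / SP combination, respectively. The point is to find, for each site, the first and last observation dates over all species combined, shift them as needed, and add them to the temporal distribution I have for each species (two new rows in each "block").

I can add a row before and after each "block" defined by SP with the following code (but those rows are currently based on the first and last dates of the block itself, not on the dates I want):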

do.call(rbind, by(DF, DF[c("YEAR","SITE", "SP")], FUN = function(x) {
  tmp <- x[c(1, seq(nrow(x)), nrow(x)), ]
  tmp$DATE[1] <- tmp$DATE[1] - 7
  tmp$DATE[nrow(tmp)] <- tmp$DATE[nrow(tmp)] + 7
  return(tmp)
}))

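My question is how to chain these two commands so that the second one (SITE / YEAR / SP) adds rows containing the dates from the first one (SITE / YEAR). I tried adding a loop inside my function(x), as well as another do.call command, but it did not work.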
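One way the two steps might be chained, as a sketch rather than a definitive solution: compute the limits once per SITE / YEAR subset, then pad each species block inside that subset with those shared limits. The toy data and the helper name `pad_site_year` are hypothetical, DATE is assumed to be numeric, and setting SUMNB to 0 on the added rows is an assumption.

```r
# Toy data (hypothetical): two species at one site in one year
DF <- data.frame(
  YEAR  = 2005,
  SITE  = "S1",
  SP    = c("sp A", "sp A", "sp B", "sp B"),
  DATE  = c(12895, 12910, 12900, 12905),
  SUMNB = c(2, 2, 1, 1),
  stringsAsFactors = FALSE
)

pad_site_year <- function(x) {
  # Limits taken over all species of this SITE / YEAR subset
  start <- min(x$DATE) - 7
  end   <- max(x$DATE) + 7
  # Pad each species block of the subset with those shared limits
  blocks <- lapply(split(x, as.character(x$SP)), function(s) {
    s   <- s[order(s$DATE), ]
    out <- s[c(1, seq_len(nrow(s)), nrow(s)), ]
    out$DATE[1]         <- start
    out$DATE[nrow(out)] <- end
    out$SUMNB[c(1, nrow(out))] <- 0  # assumption: padding rows carry zero counts
    out
  })
  do.call(rbind, blocks)
}

padded <- do.call(rbind, by(DF, DF[c("YEAR", "SITE")], FUN = pad_site_year))
rownames(padded) <- NULL
padded
```

Every species in a subset then starts and ends on the same two dates, whatever its own first and last observations were.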

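Edit:

@Troy: Yesterday I managed to add a loop inside the do.call command. My goal is to subset each SITE / YEAR combination, regardless of species. Within each subset, I take the two limits of the temporal distribution of all species combined (because I did not have that information). Then I write one row per species, filled with the information collected from the subset. Here my loop writes a new data frame with N rows for N species at the minimum date and N rows at the maximum date (see below). I will then merge this dummy data frame with my actual DF.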

# Templates for the placeholder rows (room for up to 100 species per subset)
MIN <- data.frame(matrix(NA, nrow = 100, ncol = 9))
colnames(MIN) <- c("YEAR","RN","DATE","NOM","SITE","LONG","SP","SUMNB","NB100")
MAX <- data.frame(matrix(NA, nrow = 100, ncol = 9))
colnames(MAX) <- c("YEAR","RN","DATE","NOM","SITE","LONG","SP","SUMNB","NB100")
head(do.call(rbind, by(AGG100, AGG100[c("YEAR","SITE")], FUN = function(x) {
  splist <- unique(x$SP)
  lsp <- length(splist)
  for (i in seq_len(lsp)) {
    # Row dated 7 days before the first observation of the subset
    MIN$SP[i]    <- as.character(splist[i])
    MIN$SITE[i]  <- as.character(unique(x$SITE))
    MIN$DATE[i]  <- as.character(min(x$DATE) - 7)
    MIN$RN[i]    <- as.character(unique(x$RN))
    MIN$YEAR[i]  <- as.character(unique(x$YEAR))
    MIN$NOM[i]   <- as.character(unique(x$NOM))
    MIN$LONG[i]  <- as.numeric(unique(x$LONG))
    MIN$SUMNB[i] <- 0
    MIN$NB100[i] <- 0
    # Row dated 7 days after the last observation of the subset
    MAX$SP[i]    <- as.character(splist[i])
    MAX$SITE[i]  <- as.character(unique(x$SITE))
    MAX$DATE[i]  <- as.character(max(x$DATE) + 7)
    MAX$RN[i]    <- as.character(unique(x$RN))
    MAX$YEAR[i]  <- as.character(unique(x$YEAR))
    MAX$NOM[i]   <- as.character(unique(x$NOM))
    MAX$LONG[i]  <- as.numeric(unique(x$LONG))
    MAX$SUMNB[i] <- 0
    MAX$NB100[i] <- 0
  }
  # Keep only the rows that were actually filled in
  MINMAX <- rbind(MIN, MAX)
  MINMAX <- MINMAX[complete.cases(MINMAX), ]
  return(MINMAX)
})), n = 50)

  YEAR     RN       DATE      NOM               SITE LONG SP
1 2003 RNN027 2003-04-10 ESTAGNOL RNN027-Estagnol 01  540 Brintesia circe (Fabricius, 1775)
2 2003 RNN027 2003-04-10 ESTAGNOL RNN027-Estagnol 01  540 Carcharodus alceae (Esper, 1780)
3 2003 RNN027 2003-04-10 ESTAGNOL RNN027-Estagnol 01  540 Celastrina argiolus (Linnaeus, 1758)
4 2003 RNN027 2003-04-10 ESTAGNOL RNN027-Estagnol 01  540 Coenonympha dorus (Esper, 1782)
5 2003 RNN027 2003-04-10 ESTAGNOL RNN027-Estagnol 01  540 Coenonympha pamphilus (Linnaeus, 1758)
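For comparison, the same placeholder rows could probably be built without the loop or the preallocated templates, by taking one unique row per species and stamping the two limit dates onto it. This is only a sketch: `AGG` is a made-up stand-in for AGG100, `limit_rows` is a hypothetical helper, and it assumes RN, NOM and LONG are constant within a SITE / YEAR subset and that DATE is of class Date.

```r
# Toy stand-in for AGG100 (hypothetical values)
AGG <- data.frame(
  YEAR = 2003, RN = "RNN027", NOM = "ESTAGNOL", SITE = "S1", LONG = 540,
  DATE = as.Date(c("2003-04-17", "2003-05-02", "2003-06-20")),
  SP = c("sp A", "sp B", "sp A"),
  SUMNB = c(2, 1, 3), NB100 = c(0.37, 0.19, 0.56),
  stringsAsFactors = FALSE
)

limit_rows <- function(x) {
  # One row per species, keeping the columns that are constant in the subset
  base <- unique(x[c("YEAR", "RN", "NOM", "SITE", "LONG", "SP")])
  base$SUMNB <- 0
  base$NB100 <- 0
  lower <- base
  lower$DATE <- min(x$DATE) - 7   # 7 days before the first observation
  upper <- base
  upper$DATE <- max(x$DATE) + 7   # 7 days after the last observation
  rbind(lower, upper)
}

MINMAX <- do.call(rbind, by(AGG, AGG[c("YEAR", "SITE")], FUN = limit_rows))
rownames(MINMAX) <- NULL

# Append the placeholder rows to the real observations and sort
full <- rbind(AGG, MINMAX[names(AGG)])
full <- full[order(full$SITE, full$YEAR, full$SP, full$DATE), ]
full
```

Keeping DATE as a Date (rather than converting it to character) means the placeholder rows can be bound to the real data and sorted directly.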

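Edit 2: It is working now, thanks for your help!

1 answer:

Answer 0 (score: 1)

Using plyr. If your remaining columns 6-10 are always the same for any combination of DATE / SITE, this can be simplified further (no need for merge()).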

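library(plyr)

sp <- read.csv("sp.csv")
sp <- sp[, 2:10]  # drop the ID numbers from the csv

# First date per YEAR / SITE (grouping without SP so the merge keys match
# and SP is not duplicated), then the matching rows shifted back 7 days
mins <- ddply(sp, .(YEAR, SITE), summarise, DATE = min(DATE))
mins <- merge(sp, mins, by = c("YEAR", "SITE", "DATE"))
mins$DATE <- mins$DATE - 7

# Last date per YEAR / SITE, then the matching rows shifted forward 7 days
maxs <- ddply(sp, .(YEAR, SITE), summarise, DATE = max(DATE))
maxs <- merge(sp, maxs, by = c("YEAR", "SITE", "DATE"))
maxs$DATE <- maxs$DATE + 7

# Original rows plus the two sets of padding rows, printed in date order
sp.new <- rbind(mins, sp, maxs)
sp.new[order(sp.new$DATE), ]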

   YEAR            SITE  DATE     RN      NOM   LONG                                     SP SUMNB     NB100
1  2003 RNN027-Estagnol 12159 RNN027 ESTAGNOL 03 330 Anthocharis cardamines (Linnaeus,1758)     2 0.6060606
7  2003 RNN027-Estagnol 12166 RNN027 ESTAGNOL 03 330 Anthocharis cardamines (Linnaeus,1758)     2 0.6060606
12 2003 RNN027-Estagnol 12173 RNN027 ESTAGNOL 03 330 Anthocharis cardamines (Linnaeus,1758)     2 0.6060606
2  2005 RNN027-Estagnol 12888 RNN027 ESTAGNOL 01 540 Anthocharis cardamines (Linnaeus,1758)     2 0.3703704
10 2005 RNN027-Estagnol 12895 RNN027 ESTAGNOL 01 540 Anthocharis cardamines (Linnaeus,1758)     2 0.3703704
11 2005 RNN027-Estagnol 12910 RNN027 ESTAGNOL 01 540 Anthocharis cardamines (Linnaeus,1758)     2 0.3703704
13 2005 RNN027-Estagnol 12917 RNN027 ESTAGNOL 01 540 Anthocharis cardamines (Linnaeus,1758)     2 0.3703704
3  2006 RNN027-Estagnol 13245 RNN027 ESTAGNOL 03 330 Anthocharis cardamines (Linnaeus,1758)     2 0.6060606
8  2006 RNN027-Estagnol 13252 RNN027 ESTAGNOL 03 330 Anthocharis cardamines (Linnaeus,1758)     2 0.6060606
9  2006 RNN027-Estagnol 13257 RNN027 ESTAGNOL 03 330 Anthocharis cardamines (Linnaeus,1758)     2 0.6060606
14 2006 RNN027-Estagnol 13264 RNN027 ESTAGNOL 03 330 Anthocharis cardamines (Linnaeus,1758)     2 0.6060606
4  2011 RNN027-Estagnol 15049 RNN027 ESTAGNOL 02 310 Anthocharis cardamines (Linnaeus,1758)     1 0.3225806
5  2011 RNN027-Estagnol 15056 RNN027 ESTAGNOL 02 310 Anthocharis cardamines (Linnaeus,1758)     1 0.3225806
6  2011 RNN027-Estagnol 15075 RNN027 ESTAGNOL 02 310 Anthocharis cardamines (Linnaeus,1758)     1 0.3225806
15 2011 RNN027-Estagnol 15082 RNN027 ESTAGNOL 02 310 Anthocharis cardamines (Linnaeus,1758)     1 0.3225806