Using ddply on multiple data frames and creating corresponding new data frames

Time: 2013-09-11 20:56:18

Tags: r aggregate plyr

I have 18 data frames named ageharmonic1, ageharmonic2, ageharmonic3, ..., ageharmonic18. All of the data frames have the same structure and similar contents. Here is the head of one of them.

ageharmonic1 <- structure(list(Time = c(129, 129.041687011719, 129.08332824707, 
129.125015258789, 129.166687011719, 129.20832824707), Dye = c(0.99999612569809, 
0.999995410442352, 0.999996840953827, 0.999998211860657, 1.00000166893005, 
0.999999165534973), ageconc = c(583.908142089844, 576.525756835938, 
572.939453125, 572.553527832031, 573.761291503906, 578.520263671875
), id = c("station1", "station1", "station1", "station1", "station1", 
"station1"), dist = c(0, 0, 0, 0, 0, 0), age = c(0.00675822227239628, 
0.00667278244035045, 0.00663126461889936, 0.00662678879212212, 
0.00664074460576439, 0.0066958419725371)), .Names = c("Time", 
"Dye", "ageconc", "id", "dist", "age"), row.names = c(NA, 6L), class = "data.frame")


> head(ageharmonic1)
      Time       Dye  ageconc       id dist         age
1 129.0000 0.9999961 583.9081 station1    0 0.006758222
2 129.0417 0.9999954 576.5258 station1    0 0.006672782
3 129.0833 0.9999968 572.9395 station1    0 0.006631265
4 129.1250 0.9999982 572.5535 station1    0 0.006626789
5 129.1667 1.0000017 573.7613 station1    0 0.006640745
6 129.2083 0.9999992 578.5203 station1    0 0.006695842

What I want to do now is use the ddply function from the plyr package to aggregate the data frame by the id variable:

aggreg1 <- ddply(ageharmonic1, .(id), summarise, meanage=mean(age))

I would like to apply the same formula to all of the data frames and automatically create the data frames aggreg1, aggreg2, aggreg3, ..., aggreg18.

This is what I tried:

for (i in 1:18){
  aggreg[i] <- ddply(paste0("ageharmonic",i),.(id),summarise,meanage=mean(age))
}

The expression paste0("ageharmonic",i) evaluates to a character string, so it does not seem to refer to the data frame I am trying to process.

2 Answers:

Answer 0 (score: 2)

If you put your data frames in a list, you can try this:

# a small example
# create some data frames
df1 <- data.frame(id = rep(1:2, each = 3), age = rnorm(6))
df2 <- data.frame(id = rep(3:4, each = 3), age = rnorm(6))
df3 <- data.frame(id = rep(1:2, each = 3), age = rnorm(6))

# create a list of data frames
mylist <- list(df1, df2, df3)
mylist

# for each element in the list (i.e. a single data frame), apply the function 'aggregate',
# where mean age per id is calculated
# store aggregated results in a new list
mylist2 <- lapply(seq_along(mylist), function(x) aggregate(age ~ id, data = mylist[[x]], mean))
mylist2
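
To connect this to the naming scheme in the question, the list does not have to be typed out by hand. The following is a minimal sketch, assuming the objects ageharmonic1, ageharmonic2, ... already exist in the current environment. The three small stand-in data frames and the names df_names, agelist and agglist are made up for illustration. It uses base R's mget(), which looks up objects by name and returns them as a named list.

```r
# Stand-in data frames (hypothetical values) with the id and age columns from the question
ageharmonic1 <- data.frame(id = rep(c("station1", "station2"), each = 3),
                           age = c(1, 2, 3, 4, 5, 6))
ageharmonic2 <- data.frame(id = rep(c("station1", "station2"), each = 3),
                           age = c(2, 4, 6, 8, 10, 12))
ageharmonic3 <- data.frame(id = rep(c("station1", "station2"), each = 3),
                           age = c(0, 0, 0, 3, 3, 3))

# Build the object names as strings, then fetch the objects they name
df_names <- paste0("ageharmonic", 1:3)   # would be 1:18 for the real data
agelist <- mget(df_names)                # named list of data frames

# Mean age per id for every data frame; the list names are preserved
agglist <- lapply(agelist, function(d) aggregate(age ~ id, data = d, FUN = mean))
agglist[["ageharmonic2"]]
```

Iterating over the list elements directly, rather than over seq_along(), keeps the original object names on the result, which makes it easier to tell which aggregate belongs to which input.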

Answer 1 (score: 2)

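library(plyr)
# two example data frames taken from the built-in mtcars data set
mydata1<-mtcars[1:10,1:2]
mydata2<-mtcars[11:20,1:2]
# put them in a list and apply ddply to each element
mydata<-list(mydata1,mydata2)
kk<-Map(function(x) ddply(x,.(cyl),summarize,mpg=mean(mpg)), mydata)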

> kk
[[1]]
  cyl      mpg
1   4 23.33333
2   6 20.14000
3   8 16.50000

[[2]]
  cyl      mpg
1   4 32.23333
2   6 17.80000
3   8 14.06667
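
Both answers return the aggregated results as a list rather than as separate objects named aggreg1, aggreg2, and so on. If individually named objects are really needed, one possible approach is sketched below. The list results and its values are hypothetical stand-ins for a list of aggregated data frames. It uses base R's list2env(), which copies the elements of a named list into an environment.

```r
# Hypothetical list of aggregated results (stand-in for a list such as kk)
results <- list(
  data.frame(id = c("a", "b"), meanage = c(1.5, 2.5)),
  data.frame(id = c("a", "b"), meanage = c(3.5, 4.5))
)

# Name the elements aggreg1, aggreg2, ...
names(results) <- paste0("aggreg", seq_along(results))

# A single result can be accessed by name without creating new objects
results[["aggreg2"]]

# Optional: copy every element into the current environment as its own object
invisible(list2env(results, envir = environment()))
aggreg1
```

Keeping the results in the named list is usually easier to work with afterwards, because further steps can again be written with lapply() or Map() instead of repeating code for each object.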