ddply inline function over multiple columns

Date: 2014-05-31 17:17:29

Tags: r plyr inline-functions

How can I pass a vector/list to a plyr::ddply inline function? This code works:

newdf <- ddply(olddf, .(V1, V2), function(df)
                    c( mean(df$V3),
                       mean(df$V4),
                       mean(df$V5),
                       mean(df$V6),
                       mean(df$V7),
                       mean(df$V8),
                       mean(df$V9),
                       mean(df$V10),
                       mean(df$V11),
                       mean(df$V12),
                       mean(df$V13),
                       mean(df$V14),
                       mean(df$V15),
                       mean(df$V16),
                       mean(df$V17),
                       mean(df$V18),
                       mean(df$V19),
                       mean(df$V20)
                     )
               )

But I would like to do something like this (which throws an error and warnings):

newdf <-ddply( olddf, .(V1, V2), function(df)  lapply(df[,3:20], mean) )

Error in list_to_dataframe(res, attr(.data, "split_labels"), .id, id_as_factor) : 
  Results must be all atomic, or all data frames
In addition: There were 50 or more warnings (use warnings() to see the first 50)

Thanks for any suggestions.

1 Answer:

Answer 0 (score: 4)

You want sapply rather than lapply:

ddply(olddf, .(V1, V2), function(df) sapply(df[,3:20], mean) )

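lapply returns a list, which, as the error says, is not atomic. sapply, by contrast, tries to simplify its result; in your case that gives a numeric vector, the same type your first attempt returned.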
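To make the difference concrete, here is a minimal base R sketch that needs no plyr. The data frame `piece` is a hypothetical stand-in for one group's slice of `olddf`, with made-up values:

```r
# Hypothetical stand-in for the piece of the data frame that ddply passes
# to the inline function for a single (V1, V2) group.
piece <- data.frame(V3 = c(1, 3), V4 = c(2, 4))

as_list   <- lapply(piece, mean)  # a list with one element per column
as_vector <- sapply(piece, mean)  # simplified to a named numeric vector

is.list(as_list)      # TRUE: this is what ddply rejects as "not atomic"
is.atomic(as_vector)  # TRUE: ddply can bind these into rows
as_vector             # named vector: V3 = 2, V4 = 3
```

Because the vector returned by sapply carries the column names, the resulting data frame should keep V3, V4, ... as its column names.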

But a better option is colwise:

ddply(olddf, .(V1, V2), colwise(mean))
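One thing to watch for: colwise(mean) applies mean to every column of each piece, and as far as I can tell that includes the grouping columns, so non-numeric V1 or V2 may produce warnings. A small sketch of one way around that, restricting colwise to named columns through its .cols argument. The toy data below is made up for illustration, and only two value columns are shown:

```r
library(plyr)

# Hypothetical toy data; only the column naming mirrors the question.
olddf <- data.frame(
  V1 = c("a", "a", "b"),
  V2 = c("x", "x", "y"),
  V3 = c(1, 3, 5),
  V4 = c(2, 4, 6)
)

# Limit colwise() to the value columns so mean() is never
# applied to the character grouping columns V1 and V2.
newdf <- ddply(olddf, .(V1, V2), colwise(mean, .(V3, V4)))
newdf
```

plyr also provides numcolwise, which applies the function to numeric columns only and may be more convenient when there are many value columns.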