How does ddply split the data?

Time: 2019-04-16 21:59:50

Tags: r plyr

I have this data frame.

mydf<- data.frame(c("a","a","b","b","c","c"),c("e","e","e","e","e","e")
                  ,c(1,2,3,10,20,30),
                  c(5,10,20,20,15,10))
colnames(mydf)<-c("Model", "Class","Length", "Speed")

I am trying to better understand how ddply works.

I want to get the mean Length and mean Speed for each Model and Class pair.

I know this is one way to do it: ddply(mydf, .(Model, Class), .fun = summarize, mSpeed = mean(Speed), mLength = mean(Length))

I would like to know whether ddply can compute the means without my having to specify each column one at a time.

I tried ddply(mydf, .(Model, Class), .fun = mean), but I got this warning:

Warning message: 1: In mean.default(piece, ...) : argument is not numeric or logical: returning NA

What does ddply pass to the function argument? Is there a way to use ddply to apply one function to every column?

My goal is to learn more about ddply. I will only accept answers that use ddply.

1 Answer:

Answer 0: (score: 0)

Here's a solution using dplyr and its summarize_if function.



library(dplyr)


mydf<- data.frame(c("a","a","b","b","c","c"),c("e","e","e","e","e","e")
                  ,c(1,2,3,10,20,30),
                  c(5,10,20,20,15,10))
colnames(mydf)<-c("Model", "Class","Length", "Speed")

#summarize data by Model & Class
mydf %>%  group_by(Model, Class) %>% summarize_if(is.numeric, mean)


#> # A tibble: 3 x 4
#> # Groups:   Model [3]
#>   Model Class Length Speed
#>   <fct> <fct>  <dbl> <dbl>
#> 1 a     e        1.5   7.5
#> 2 b     e        6.5  20  
#> 3 c     e       25    12.5

Created on 2019-04-16 by the reprex package (v0.2.1)
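
Since the question asks specifically about ddply, here is a rough plyr-only sketch as well. It relies on ddply handing `.fun` one data frame per Model/Class combination, which would explain the warning above: calling mean() on a whole data frame returns NA. The names `piece`, `pieces`, and `means` are only illustrative, and wrapping mean() in plyr's numcolwise() is one possible approach, not necessarily the only one.

```r
library(plyr)

mydf <- data.frame(
  Model  = c("a", "a", "b", "b", "c", "c"),
  Class  = c("e", "e", "e", "e", "e", "e"),
  Length = c(1, 2, 3, 10, 20, 30),
  Speed  = c(5, 10, 20, 20, 15, 10)
)

# Inspect what ddply passes to .fun: each piece is a data frame
# holding the rows of one Model/Class combination.
pieces <- ddply(mydf, .(Model, Class), function(piece) {
  data.frame(piece_class = class(piece)[1], n_rows = nrow(piece))
})
print(pieces)

# numcolwise() wraps mean() so that it is applied to each numeric
# column of the piece instead of to the data frame as a whole.
means <- ddply(mydf, .(Model, Class), numcolwise(mean))
print(means)
```

With this data, `means` should contain the same Length and Speed values as the dplyr output above. If non-numeric columns need different handling, plyr's colwise() takes a column filter, but I have not tried that on this example.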