I have this data frame:
mydf <- data.frame(c("a", "a", "b", "b", "c", "c"),
                   c("e", "e", "e", "e", "e", "e"),
                   c(1, 2, 3, 10, 20, 30),
                   c(5, 10, 20, 20, 15, 10))
colnames(mydf) <- c("Model", "Class", "Length", "Speed")
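I'm trying to get a better understanding of how ddply works.
I want the mean Length and Speed for each Model/Class pair.
I know this is one way to do it:
ddply(mydf, .(Model, Class), .fun = summarize, mSpeed = mean(Speed), mLength = mean(Length))
I'd like to know whether ddply can compute the means without my having to specify the columns one at a time.
I tried
ddply(mydf, .(Model, Class), .fun = mean)
but got this warning:
Warning messages:
1: In mean.default(piece, ...) : argument is not numeric or logical: returning NA
What does ddply pass to the function argument? Is there a way to apply one function to every column using ddply?
My goal is to learn more about ddply, so I will only accept ddply answers.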
Answer 0 (score: 0)
Here's a solution using dplyr and its summarize_if function.
library(dplyr)
mydf <- data.frame(c("a", "a", "b", "b", "c", "c"),
                   c("e", "e", "e", "e", "e", "e"),
                   c(1, 2, 3, 10, 20, 30),
                   c(5, 10, 20, 20, 15, 10))
colnames(mydf) <- c("Model", "Class", "Length", "Speed")
# Summarize every numeric column by Model and Class
mydf %>%
  group_by(Model, Class) %>%
  summarize_if(is.numeric, mean)
#> # A tibble: 3 x 4
#> # Groups:   Model [3]
#>   Model Class Length Speed
#>   <fct> <fct>  <dbl> <dbl>
#> 1 a     e        1.5   7.5
#> 2 b     e        6.5  20
#> 3 c     e       25    12.5
Created on 2019-04-16 by the reprex package (v0.2.1)
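Since the question asks specifically about ddply, here is a minimal sketch of a plyr-only alternative. It is not part of the reprex above. It assumes the plyr package is installed and relies on plyr's numcolwise() helper, which wraps a function so that it is applied to each numeric column of a data frame. As far as I understand it, ddply hands each group to .fun as a whole data frame (the "piece" named in the warning), which is why passing mean directly does not work. The variable name res is only illustrative.

```r
library(plyr)

# Same data as in the question, with the column names set directly.
mydf <- data.frame(
  Model  = c("a", "a", "b", "b", "c", "c"),
  Class  = c("e", "e", "e", "e", "e", "e"),
  Length = c(1, 2, 3, 10, 20, 30),
  Speed  = c(5, 10, 20, 20, 15, 10)
)

# ddply splits mydf by Model and Class and passes each piece
# (a data frame, not a numeric vector) to .fun.
# numcolwise(mean) turns mean into a function that takes a data frame
# and returns the mean of each of its numeric columns.
res <- ddply(mydf, .(Model, Class), numcolwise(mean))

print(res)
# Length means per group: 1.5, 6.5, 25
# Speed means per group:  7.5, 20, 12.5
```

If non-numeric columns should be handled as well, plyr also provides colwise() and catcolwise(); which one fits depends on the function being applied.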