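I made a data frame of people's names, ages, and favorite movies. I want to write something that operates on the data frame and gives me the average age of the people who share each favorite movie. This is what I have so far.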
persons <- list(firstName = c("Steve","Bob","Bill","Chris","Matt","Evan"), lastName = c("Williams","Barker","Barker","Williams","Stevenson","Parker"), age = c(22,30,41,14,9,93), favoriteMovie = c("Alien","The Shining","The Shining","Halloween","Alien","Alien"))
d1 <- data.frame(persons$firstName,persons$lastName,persons$age,persons$favoriteMovie)
d1
  persons.firstName persons.lastName persons.age persons.favoriteMovie
1             Steve         Williams          22                 Alien
2               Bob           Barker          30           The Shining
3              Bill           Barker          41           The Shining
4             Chris         Williams          14             Halloween
5              Matt        Stevenson           9                 Alien
6              Evan           Parker          93                 Alien
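I could do this with a loop of if statements, but I don't think that is the most efficient way. I'm sure there is a way to do it in a single call, but I don't know what it is.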
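For reference, the loop-and-if approach mentioned above might look roughly like the sketch below. The names `movies` and `avg_age` are made up for illustration, and the `as.character()` calls are there on the assumption that the movie column may be stored as a factor (the default in R versions before 4.0).

```r
# Rebuild the example data frame from the question.
persons <- list(firstName = c("Steve","Bob","Bill","Chris","Matt","Evan"),
                lastName = c("Williams","Barker","Barker","Williams","Stevenson","Parker"),
                age = c(22,30,41,14,9,93),
                favoriteMovie = c("Alien","The Shining","The Shining","Halloween","Alien","Alien"))
d1 <- data.frame(persons$firstName, persons$lastName, persons$age, persons$favoriteMovie)

# Distinct movie titles, as plain character strings.
movies <- unique(as.character(d1$persons.favoriteMovie))

# One slot per movie, named by title (hypothetical helper vector).
avg_age <- numeric(length(movies))
names(avg_age) <- movies

for (i in seq_along(movies)) {
  total <- 0
  n <- 0
  # Scan every row and accumulate the ages of people who like this movie.
  for (j in seq_len(nrow(d1))) {
    if (as.character(d1$persons.favoriteMovie[j]) == movies[i]) {
      total <- total + d1$persons.age[j]
      n <- n + 1
    }
  }
  avg_age[i] <- total / n
}

avg_age
```

This works, but it rescans the whole data frame once per movie, which is why a grouped, single-call approach is preferable.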
Answer 0: (score: 3)
You can try tapply, which applies a function (here mean) to the ages within each favorite-movie group:
> with(d1, tapply(persons.age, persons.favoriteMovie, mean))
      Alien   Halloween The Shining 
   41.33333    14.00000    35.50000 
You may also want to look at this answer.
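tapply returns a named vector. If a data frame result is more convenient, base R's aggregate with a formula is one possible alternative. The following is a minimal sketch under that assumption, rebuilding d1 as in the question; the name `res` is made up for illustration.

```r
# Rebuild the example data frame from the question.
persons <- list(firstName = c("Steve","Bob","Bill","Chris","Matt","Evan"),
                lastName = c("Williams","Barker","Barker","Williams","Stevenson","Parker"),
                age = c(22,30,41,14,9,93),
                favoriteMovie = c("Alien","The Shining","The Shining","Halloween","Alien","Alien"))
d1 <- data.frame(persons$firstName, persons$lastName, persons$age, persons$favoriteMovie)

# Mean of persons.age within each level of persons.favoriteMovie;
# the result is a data frame with one row per movie.
res <- aggregate(persons.age ~ persons.favoriteMovie, data = d1, FUN = mean)
res
```

The result has one column for the grouping variable and one for the mean age, so it can be merged or plotted like any other data frame.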
Answer 1: (score: 2)
You can use by():
by(d1$persons.age, d1$persons.favoriteMovie, mean)
d1$persons.favoriteMovie: Alien
[1] 41.33333
-------------------------------------------------------------------------------------------------------------
d1$persons.favoriteMovie: Halloween
[1] 14
-------------------------------------------------------------------------------------------------------------
d1$persons.favoriteMovie: The Shining
[1] 35.5
Answer 2: (score: 1)
The doBy package, which provides the summaryBy function, can help here.
library(doBy)
summaryBy(persons.age~persons.favoriteMovie, data=d1, FUN=c(mean))
#  persons.favoriteMovie persons.age.mean
#1                 Alien         41.33333
#2             Halloween         14.00000
#3           The Shining         35.50000
Or you can use dplyr.
library(dplyr)
grouped <- group_by(d1, persons.favoriteMovie)
summarise(grouped, mean=mean(persons.age))
#  persons.favoriteMovie     mean
#                 (fctr)    (dbl)
#1                 Alien 41.33333
#2             Halloween 14.00000
#3           The Shining 35.50000
Answer 3: (score: 1)
We can use data.table. Note that setDT converts d1 to a data.table by reference, so d1 itself is modified.
library(data.table)
setDT(d1)[,.(persons.age = mean(persons.age)) , persons.favoriteMovie]
#   persons.favoriteMovie persons.age
#1:                 Alien    41.33333
#2:           The Shining    35.50000
#3:             Halloween    14.00000