是否可以在单个tapply或聚合语句中包含两个函数?
下面我使用两个tapply语句和两个聚合语句:一个用于均值,一个用于SD 我更愿意结合这些陈述。
my.Data = read.table(text = "
animal age sex weight
1 adult female 100
2 young male 75
3 adult male 90
4 adult female 95
5 young female 80
", sep = "", header = TRUE)
with(my.Data, tapply(weight, list(age, sex), function(x) {mean(x)}))
with(my.Data, tapply(weight, list(age, sex), function(x) {sd(x) }))
with(my.Data, aggregate(weight ~ age + sex, FUN = mean)
with(my.Data, aggregate(weight ~ age + sex, FUN = sd)
# this does not work:
with(my.Data, tapply(weight, list(age, sex), function(x) {mean(x) ; sd(x)}))
# I would also prefer that the output be formatted something similar to that
# show below. `aggregate` formats the output perfectly. I just cannot figure
# out how to implement two functions in one statement.
age sex mean sd
adult female 97.5 3.535534
adult male 90 NA
young female 80.0 NA
young male 75 NA
我总是可以运行两个单独的语句并合并输出。我只是希望有可能 一个稍微方便的解决方案。
我在下面发现了以下答案:Apply multiple functions to column using tapply
f <- function(x) c(mean(x), sd(x))
do.call( rbind, with(my.Data, tapply(weight, list(age, sex), f)) )
但是,行或列都没有标记。
[,1] [,2]
[1,] 97.5 3.535534
[2,] 80.0 NA
[3,] 90.0 NA
[4,] 75.0 NA
我更喜欢基础R中的解决方案。来自plyr
包的解决方案已发布在上面的链接中。如果我可以将正确的行和列标题添加到上面的输出中,那将是完美的。
答案 0 :(得分:16)
但这些应该有:
with(my.Data, aggregate(weight, list(age, sex), function(x) { c(MEAN=mean(x), SD=sd(x) )}))
with(my.Data, tapply(weight, list(age, sex), function(x) { c(mean(x) , sd(x) )} ))
# Not a nice structure but the results are in there
with(my.Data, aggregate(weight ~ age + sex, FUN = function(x) c( SD = sd(x), MN= mean(x) ) ) )
age sex weight.SD weight.MN
1 adult female 3.535534 97.500000
2 young female NA 80.000000
3 adult male NA 90.000000
4 young male NA 75.
要遵循的原则是让你的函数返回“one thing”,它可以是vector或list,但不能连续调用两个函数调用。
答案 1 :(得分:9)
如果您想使用data.table,则会在其中内置with
和by
:
library(data.table)
myDT <- data.table(my.Data, key="animal")
myDT[, c("mean", "sd") := list(mean(weight), sd(weight)), by=list(age, sex)]
myDT[, list(mean_Aggr=sum(mean(weight)), sd_Aggr=sum(sd(weight))), by=list(age, sex)]
age sex mean_Aggr sd_Aggr
1: adult female 96.0 3.6055513
2: young male 76.5 2.1213203
3: adult male 91.0 1.4142136
4: young female 84.5 0.7071068
我使用了稍微不同的数据集,以便没有{s}的NA
值
答案 2 :(得分:7)
本着分享的精神,如果你熟悉SQL ,你也可以考虑使用“sqldf”包。 (例如,添加强调是因为您需要知道mean
是avg
才能获得所需的结果。)
sqldf("select age, sex,
avg(weight) `Wt.Mean`,
stdev(weight) `Wt.SD`
from `my.Data`
group by age, sex")
age sex Wt.Mean Wt.SD
1 adult female 97.5 3.535534
2 adult male 90.0 0.000000
3 young female 80.0 0.000000
4 young male 75.0 0.000000
答案 3 :(得分:5)
重塑可让您传递2个功能; reshape2没有。
library(reshape)
my.Data = read.table(text = "
animal age sex weight
1 adult female 100
2 young male 75
3 adult male 90
4 adult female 95
5 young female 80
", sep = "", header = TRUE)
my.Data[,1]<- NULL
(a1<- melt(my.Data, id=c("age", "sex"), measured=c("weight")))
(cast(a1, age + sex ~ variable, c(mean, sd), fill=NA))
# age sex weight_mean weight_sd
# 1 adult female 97.5 3.535534
# 2 adult male 90.0 NA
# 3 young female 80.0 NA
# 4 young male 75.0 NA
我欠@Ramnath,他昨天注意到this。