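I am trying to split a data frame by participant_number and then compute the grand mean of specific columns, Happiness and Joy (excluding the column Lolz). Why does taking the mean of the columns produce these warnings: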
Warning messages:
1: In mean.default(function (x, na.rm = FALSE, dims = 1L) :
argument is not numeric or logical: returning NA
2: In mean.default(function (x, na.rm = FALSE, dims = 1L) :
argument is not numeric or logical: returning NA
My code:
library(dplyr)
df <- data.frame(participant_number = c(1, 1, 1, 2, 2),
                 Happiness = c(3, 4, 2, 1, 3),
                 Joy = c(1, 2, 3, 5, 4),
                 Lolz = c(3, 3, 3, 3, 3))
df %>% group_by(participant_number) %>%
  select(Happiness, Joy) %>%
  mutate(emoMean = mean(colMeans))
> df
participant_number Happiness Joy Lolz
1 1 3 1 3
2 1 4 2 3
3 1 2 3 3
4 2 1 5 3
5 2 3 4 3
Goal: emoMean
participant_number ... emoMean
1 2.5 (3+1+4+2+2+3)/6 #Note that this value does not include participant_number
1 2.5
1 2.5
2 6.5
2 6.5
Note:
I tried to use this as a potential solution, but got completely lost.
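Answer 0 (score: 2)
For your specific case, you can add the two columns, take the mean, and then divide by 2, because the two columns always have the same number of elements: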
df %>% group_by(participant_number) %>% mutate(emoMean = mean(Happiness + Joy)/2)
Source: local data frame [5 x 5]
Groups: participant_number [2]
participant_number Happiness Joy Lolz emoMean
<dbl> <dbl> <dbl> <dbl> <dbl>
1 1 3 1 3 2.50
2 1 4 2 3 2.50
3 1 2 3 3 2.50
4 2 1 5 3 3.25
5 2 3 4 3 3.25
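Note: going by your definition of the mean for the first group, I think the value for the second group should be 3.25 rather than 6.5, since (1+5+3+4)/4 = 3.25.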
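As for the warning itself: the text of the warning shows the signature function (x, na.rm = FALSE, dims = 1L), which is that of colMeans, so it appears that mean(colMeans) hands the function itself to mean() rather than any numbers. Below is a minimal sketch of that behaviour, together with a variant that pools both columns into one vector per group. The names res_bad and out are only illustrative, and this assumes there are no missing values (otherwise na.rm = TRUE would be needed and the two approaches could differ).

```r
library(dplyr)

df <- data.frame(
  participant_number = c(1, 1, 1, 2, 2),
  Happiness = c(3, 4, 2, 1, 3),
  Joy = c(1, 2, 3, 5, 4),
  Lolz = c(3, 3, 3, 3, 3)
)

# Passing the function object colMeans to mean() gives NA plus a warning,
# because a function is neither numeric nor logical.
res_bad <- suppressWarnings(mean(colMeans))

# Pool Happiness and Joy into a single vector within each group,
# then take one mean over all of those values.
out <- df %>%
  group_by(participant_number) %>%
  mutate(emoMean = mean(c(Happiness, Joy))) %>%
  ungroup()
```

Pooling with c() avoids the division by the number of columns, so the same line should keep working if a third column is added to the call.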
Answer 1 (score: 1)
A base R alternative that does not need plyr or dplyr:
df <- data.frame(participant_number=c(1,1,1,2,2),Happiness=c(3,4,2,1,3),Joy=c(1,2,3,5,4),Lolz=c(3,3,3,3,3))
df$mean <- ave(apply(df[,2:3],1,mean, na.rm=TRUE), df$participant_number )
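The same idea can be sketched with rowMeans and column names instead of column positions, which may be a little more robust if the column order changes. The name emo_cols is a hypothetical helper, and the result is stored in emoMean to match the question. Averaging the row means within a group equals the pooled mean here only because every row contributes the same number of values (no NAs).

```r
df <- data.frame(
  participant_number = c(1, 1, 1, 2, 2),
  Happiness = c(3, 4, 2, 1, 3),
  Joy = c(1, 2, 3, 5, 4),
  Lolz = c(3, 3, 3, 3, 3)
)

# Columns to include in the grand mean (hypothetical helper name)
emo_cols <- c("Happiness", "Joy")

# rowMeans() gives one mean per row; ave() then averages those within
# each participant (its default FUN is mean) and recycles the result
# back to every row of the group.
df$emoMean <- ave(rowMeans(df[, emo_cols]), df$participant_number)
```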
Answer 2 (score: 1)
We can use data.table:
library(data.table)
setDT(df)[, emoMean := mean(Happiness + Joy)/2 , by = participant_number]
If there are many columns to sum, one option is Reduce:
nm1 <- names(df)[2:3]
# mean() collapses the per-row sums to a single value per group
setDT(df)[, emoMean := mean(Reduce(`+`, .SD))/length(nm1),
          by = participant_number, .SDcols = nm1]
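Another possible way to write the multi-column case is to flatten .SD with unlist and take a single mean per group. This is only a sketch: dt and emo_cols are illustrative names, and it assumes the selected columns are all numeric with no missing values.

```r
library(data.table)

dt <- data.table(
  participant_number = c(1, 1, 1, 2, 2),
  Happiness = c(3, 4, 2, 1, 3),
  Joy = c(1, 2, 3, 5, 4),
  Lolz = c(3, 3, 3, 3, 3)
)

# Columns to include in the grand mean (hypothetical helper name)
emo_cols <- c("Happiness", "Joy")

# Within each group, .SD holds only the columns named in .SDcols;
# unlist() turns them into one vector, and mean() returns one value
# that := recycles to every row of the group.
dt[, emoMean := mean(unlist(.SD)), by = participant_number, .SDcols = emo_cols]
```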