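Why is the value of SE_daily wrong? I expected it to at least round to the nearest integer (although I actually want a decimal), but the decimal answer is completely wrong. What am I missing?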
csv<-csv%>%group_by(id_num)%>%group_by(Month)%>%group_by(Day)%>%mutate(SE_daily=mean(SelfEsteem, na.rm=T))
head(csv[,c(1:5,28,181)])
> head(csv[,c(1:5,28,181)])
Source: local data frame [6 x 7]
Groups: Day [3]

    X.1     X id_num Month   Day SelfEsteem SE_daily
  <int> <int>  <int> <int> <int>      <int>    <dbl>
1     1     1     29     2    19          4 3.457944  #mean(4,4,3)= 4, expected answer= 3.66666666667
2     2     2     29     2    19          4 3.457944
3     3     3     29     2    19          3 3.457944
4     4     4     29     2    20          4 3.424242  #expected answer= 4
5     5     5     29     2    21          4 3.318182  #expected answer=4
6     6     6     29     2    21          4 3.318182
Head of csv as a structure:
structure(list(X.1 = 1:6, X = 1:6,
id_num = c(29L, 29L, 29L, 29L, 29L, 29L),
Month = c(2L, 2L, 2L, 2L, 2L, 2L),
Day = c(19L, 19L, 19L, 20L, 21L, 21L),
SelfEsteem = c(4L, 4L, 3L, 4L, 4L, 4L),
SE_daily = c(3.45794392523365, 3.45794392523365, 3.45794392523365, 3.42424242424242, 3.31818181818182, 3.31818181818182)),
.Names = c("X.1", "X", "id_num", "Month", "Day", "SelfEsteem", "SE_daily"),
row.names = c(NA, -6L),
class = "data.frame")
Answer 0: (score: 2)
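I get the expected output for SE_daily when all the grouping columns go into a single group_by call. By default each group_by call replaces the existing grouping rather than adding to it, so piping three separate calls leaves the data grouped by Day alone (the header of your output reads Groups: Day [3]). The mean is then taken over every row that shares a Day, across all id_num and Month values, assuming the data structure you posted is only a subset of the full data set.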
library(dplyr)
csv %>%
group_by(id_num, Month, Day) %>%
mutate(SE_daily=mean(SelfEsteem, na.rm=TRUE))
Output:
Source: local data frame [6 x 7]
Groups: id_num, Month, Day [3]

    X.1     X id_num Month   Day SelfEsteem SE_daily
  <int> <int>  <int> <int> <int>      <int>    <dbl>
1     1     1     29     2    19          4 3.666667
2     2     2     29     2    19          4 3.666667
3     3     3     29     2    19          3 3.666667
4     4     4     29     2    20          4 4.000000
5     5     5     29     2    21          4 4.000000
6     6     6     29     2    21          4 4.000000
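
To see the difference between the two grouping styles directly, here is a minimal sketch. It reuses the six rows from the question and adds one made-up row (id_num 30 on Day 19) as a stand-in for the rest of the data set, which was not posted. The names df, chained and combined are illustrative only.

```r
library(dplyr)

# Six rows from the question plus one hypothetical row (id_num 30, Day 19)
# standing in for the part of the data set that was not posted.
df <- data.frame(
  id_num     = c(29L, 29L, 29L, 29L, 29L, 29L, 30L),
  Month      = c(2L, 2L, 2L, 2L, 2L, 2L, 2L),
  Day        = c(19L, 19L, 19L, 20L, 21L, 21L, 19L),
  SelfEsteem = c(4L, 4L, 3L, 4L, 4L, 4L, 1L)
)

# Chained calls: each group_by() replaces the previous grouping,
# so only the last one (Day) is still in effect.
chained <- df %>% group_by(id_num) %>% group_by(Month) %>% group_by(Day)
group_vars(chained)

# Single call: all three columns define the groups.
combined <- df %>% group_by(id_num, Month, Day)
group_vars(combined)

# The Day-only mean mixes in the row from id_num 30;
# the three-column mean does not.
chained_means <- chained %>%
  mutate(SE_daily = mean(SelfEsteem, na.rm = TRUE)) %>%
  ungroup()
combined_means <- combined %>%
  mutate(SE_daily = mean(SelfEsteem, na.rm = TRUE)) %>%
  ungroup()

# Side note: mean() expects its values as one vector, so use
# mean(c(4, 4, 3)) rather than mean(4, 4, 3) when checking by hand.
manual_check <- mean(c(4, 4, 3))
```

With this toy data the first row of chained_means averages 4, 4, 3 and 1, while the first row of combined_means averages only 4, 4 and 3. If you do want to build the grouping up step by step, dplyr 1.0.0 and later accept group_by(..., .add = TRUE) to append to the existing groups (older releases spelled the argument add).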