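I want to compute the mean height across all ages for sex m and the mean height across all ages for sex f, then subtract the matching mean from each original value.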
data <- data.frame(height=c(96,72,100,45),age=c(1,2,1,2),sex=c("m","f","f","m"))
data
  height age sex
1     96   1   m
2     72   2   f
3    100   1   f
4     45   2   m
Desired output:
data
  height age sex mean   dif
1     96   1   m 70.5  25.5
2     72   2   f 86.0 -14.0
3    100   1   f 86.0  14.0
4     45   2   m 70.5 -25.5
Answer 0 (score: 2)
This is straightforward with grouping in dplyr:
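library(dplyr)

data %>%
  group_by(sex) %>%                # one group per sex, pooling all ages
  mutate(mean = mean(height),      # per-group mean, repeated on every row
         dif  = height - mean)     # deviation of each row from its group mean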
Source: local data frame [4 x 5]
Groups: sex [2]

  height   age    sex  mean   dif
   <dbl> <dbl> <fctr> <dbl> <dbl>
1     96     1      m  70.5  25.5
2     72     2      f  86.0 -14.0
3    100     1      f  86.0  14.0
4     45     2      m  70.5 -25.5
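
If you would rather not load dplyr, the same per-sex mean and difference can be sketched in base R with ave(), which returns the group mean aligned to each row. This is an alternative to the grouped approach, not part of the original answer, and it assumes the same data frame as in the question:

```r
# Rebuild the example data from the question
data <- data.frame(height = c(96, 72, 100, 45),
                   age    = c(1, 2, 1, 2),
                   sex    = c("m", "f", "f", "m"))

# ave() applies mean() within each level of sex and
# returns a vector the same length as height
data$mean <- ave(data$height, data$sex)

# Deviation of each height from the mean of its own sex
data$dif <- data$height - data$mean

data
```

Unlike the grouped pipeline, this modifies data in place and leaves no grouping attached to the result.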