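I'm new to R and can't find an answer to the following question: how can I easily compute the proportions of a two-way table of two categorical variables and add those values as a new variable? I'd like to use dplyr and mutate.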
Gender <- c("Female","Female","Male","Male")
Believer <- c("Yes","No","Yes","No")
Count <- c(100,50,200,150)
dat <- data.frame(Gender,Believer,Count)
dat
Gender Believer Count
1 Female Yes 100
2 Female No 50
3 Male Yes 200
4 Male No 150
str(dat)
'data.frame': 4 obs. of 3 variables:
$ Gender : Factor w/ 2 levels "Female","Male": 1 1 2 2
$ Believer: Factor w/ 2 levels "No","Yes": 2 1 2 1
$ Count : num 100 50 200 150
I'd like to get the following result:
dat
Gender Believer Count Prop
1 Female Yes 100 0.02
2 Female No 50 0.01
3 Male Yes 200 0.04
4 Male No 150 0.03
I'd really appreciate an answer. I'm sure it's simple, but I can't find it. Thanks a lot.
Answer 0 (score: 0)
Using dplyr, as the OP suggested:
library(dplyr)
dat %>% mutate(Prop = Count / sum(Count))
Gender Believer Count Prop
1 Female Yes 100 0.2
2 Female No 50 0.1
3 Male Yes 200 0.4
4 Male No 150 0.3
(That said, I don't see how you got 0.02. With this data, that value only comes out if you divide by sum(Count) * 10:)
dat %>% mutate(Prop = Count / (sum(Count) * 10))
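Since the question mentions a two-way table, here is a minimal base R sketch for cross-checking the mutate() result. It assumes the same data as in the question; the names tab and props are only illustrative and do not come from the original post.

```r
# Rebuild the question's data so the sketch is self-contained
dat <- data.frame(
  Gender   = c("Female", "Female", "Male", "Male"),
  Believer = c("Yes", "No", "Yes", "No"),
  Count    = c(100, 50, 200, 150)
)

# Two-way table of counts (Gender by Believer)
tab <- xtabs(Count ~ Gender + Believer, data = dat)

# Each cell as a share of the grand total
props <- prop.table(tab)

# Look up each row's cell by its dimnames and attach it as a new column
idx <- cbind(as.character(dat$Gender), as.character(dat$Believer))
dat$Prop <- as.numeric(props[idx])
dat
```

Prop should come out as 0.2, 0.1, 0.4, 0.3, the same values as Count / sum(Count) above. If proportions within each gender are wanted instead, prop.table(tab, margin = 1) gives row-wise shares.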