R: calculating proportions for a two-way table of two categorical variables and adding them as a new variable

Time: 2017-07-08 10:02:24

Tags: r

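I am new to R and could not find an answer to the following question: how can I easily calculate the proportions for a two-way table of two categorical variables and add those values as a new variable? I would like to use dplyr and mutate.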

Gender <- c("Female","Female","Male","Male") 
Believer <- c("Yes","No","Yes","No")
Count <- c(100,50,200,150)
dat <- data.frame(Gender,Believer,Count)

dat
  Gender Believer Count
1 Female      Yes   100
2 Female       No    50
3   Male      Yes   200
4   Male       No   150

str(dat)
'data.frame':   4 obs. of  3 variables:
 $ Gender  : Factor w/ 2 levels "Female","Male": 1 1 2 2
 $ Believer: Factor w/ 2 levels "No","Yes": 2 1 2 1
 $ Count   : num  100 50 200 150

I would like to get a result like this:

dat
  Gender Believer Count  Prop
1 Female      Yes   100  0.02
2 Female       No    50  0.01
3   Male      Yes   200  0.04
4   Male       No   150  0.03

I would really appreciate an answer. I am sure it is simple, but I cannot find it. Many thanks.

1 answer:

Answer 0 (score: 0)

Using dplyr, as the OP suggested:

library(dplyr)

dat %>% mutate(Prop = Count / sum(Count))

  Gender Believer Count Prop
1 Female      Yes   100  0.2
2 Female       No    50  0.1
3   Male      Yes   200  0.4
4   Male       No   150  0.3
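
As a cross-check on the mutate result, the same cell proportions can be obtained from an actual two-way table in base R. This is only a sketch: it rebuilds the question's data frame, and the names tab and props are introduced here for illustration.

```r
# Rebuild the example data from the question
dat <- data.frame(
  Gender   = c("Female", "Female", "Male", "Male"),
  Believer = c("Yes", "No", "Yes", "No"),
  Count    = c(100, 50, 200, 150)
)

# Two-way table of counts: Gender in rows, Believer in columns
tab <- xtabs(Count ~ Gender + Believer, data = dat)

# Cell proportions: each cell divided by the grand total (500 here)
props <- prop.table(tab)
props
```

The four cells of props should match the Prop column above (0.2, 0.1, 0.4, 0.3), with the Believer columns ordered alphabetically (No, then Yes).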

(I don't know how you arrived at 0.02, though. To reproduce those values you would have to divide by (sum(Count) * 10)):

dat %>% mutate(Prop = Count / (sum(Count) * 10))
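
If what was actually wanted is the proportion within each gender rather than of the grand total, a grouped mutate is one possible approach. This is a guess at the intent, not something the question states; the names PropWithinGender and dat_by_gender are made up for this sketch, and it assumes dplyr is installed.

```r
library(dplyr)

# Rebuild the example data from the question
dat <- data.frame(
  Gender   = c("Female", "Female", "Male", "Male"),
  Believer = c("Yes", "No", "Yes", "No"),
  Count    = c(100, 50, 200, 150)
)

# Proportion of each row within its own Gender group
# (hypothetical column name, chosen for illustration)
dat_by_gender <- dat %>%
  group_by(Gender) %>%
  mutate(PropWithinGender = Count / sum(Count)) %>%
  ungroup()

dat_by_gender
```

Here the proportions sum to 1 within each gender (for example 100 / 150 for Female/Yes), so they will not match the 0.02 values shown in the question either.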