I have grouped my data by three columns, and now I want to get the percentage for each group:
   GENDER preference   option num_users
   <fct>  <fct>         <int>     <int>
 1 Female Sweet             1       136
 2 Female Sweet             2        28
 3 Female Don't Know        1        18
 4 Female Don't Know        2         2
 5 Female Medium Spicy      1        62
 6 Female Medium Spicy      2         6
 7 Female Spicy             1        84
 8 Female Spicy             2        20
 9 Female Hot               1        35
10 Female Hot               2         5
# ... with 17 more rows
I managed to do this with transmute when I was grouping only by GENDER and option, but after adding preference I can't get transmute to work.
This is my approach without the preference column:
library(dplyr)
library(tidyr)

grouped_df <- df %>%
  group_by(GENDER, option) %>%
  summarize(num_users = n()) %>%
  spread(GENDER, num_users) %>%
  ungroup() %>%
  transmute(option_id = option,
            female_percent = Female / (Female + Male),
            male_percent = Male / (Female + Male)) %>%
  mutate(female_percent = round(100 * female_percent),
         male_percent = round(100 * male_percent))
How can I include preference in the approach above?
Answer 0 (score: 1)
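I think this is what you want. I had to create a data frame from scratch because you didn't provide code to build one, but the idea is the same: group by gender and option, then use prop.table to turn num_users into each preference's percentage share within its group.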
library(dplyr)
library(tidyr)

set.seed(1)
df <- data.frame(gender = rep(c("Male", "Female"), each = 10),
                 preference = rep(c("Sweet", "Don't Know", "Medium Spicy", "Spicy", "Hot"), 2),
                 option = rep(c(1, 2), 2),
                 num_users = sample(1:150, 20))

df %>%
  group_by(gender, option) %>%
  # share of each preference within its gender/option group, in percent
  mutate(perc = prop.table(num_users) * 100) %>%
  select(-num_users) %>%
  spread(preference, perc)
# A tibble: 4 x 7
# Groups:   gender, option [4]
  gender option `Don't Know`   Hot `Medium Spicy` Spicy Sweet
  <fct>   <dbl>        <dbl> <dbl>          <dbl> <dbl> <dbl>
1 Female      1        22.8  24.7            33.6  12    6.82
2 Female      2         6.58 26.8            34.7  13.9 17.9
3 Male        1        35.9   7.85           22.3  23.6 10.5
4 Male        2        13.2   2.12           22.4  31.5 30.8
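The result above gives each preference's share within a gender/option group. If what is wanted is the female/male split within each preference and option (which is what the question's original transmute computed per option), a minimal sketch might look like the following. It assumes the data is already summarized into one row per GENDER, preference and option with a num_users count, as in the question's table. The counts data frame and its numbers are made up for illustration, and pivot_wider is used in place of spread, which tidyr has superseded.

```r
library(dplyr)
library(tidyr)

# Hypothetical summary table in the same shape as the question's output;
# the counts are invented for illustration only.
counts <- data.frame(
  GENDER     = c("Female", "Male", "Female", "Male", "Female", "Male"),
  preference = c("Sweet", "Sweet", "Sweet", "Sweet", "Hot", "Hot"),
  option     = c(1, 1, 2, 2, 1, 1),
  num_users  = c(30, 10, 5, 15, 20, 20)
)

gender_split <- counts %>%
  # one group per preference/option pair, containing a Female and a Male row
  group_by(preference, option) %>%
  # each gender's share of the users in that pair, as a rounded percentage
  mutate(percent = round(100 * num_users / sum(num_users))) %>%
  ungroup() %>%
  select(-num_users) %>%
  # one column per gender, analogous to spread(GENDER, percent)
  pivot_wider(names_from = GENDER, values_from = percent) %>%
  rename(female_percent = Female, male_percent = Male)

print(gender_split)
```

If the starting point is the raw data with one row per user, something like count(GENDER, preference, option, name = "num_users") could produce the summary table first. Note that a preference/option pair with users of only one gender would yield NA in the other gender's column.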