Question

I have grouped my data by three columns, and now I want to get the percentage for each group:

   GENDER      preference                           option num_users
   <fct>       <fct>                               <int>     <int>
 1 Female      Sweet                                   1       136
 2 Female      Sweet                                   2        28
 3 Female      Don't Know                              1        18
 4 Female      Don't Know                              2         2
 5 Female      Medium Spicy                            1        62
 6 Female      Medium Spicy                            2         6
 7 Female      Spicy                                   1        84
 8 Female      Spicy                                   2        20
 9 Female      Hot                                     1        35
10 Female      Hot                                     2         5
# ... with 17 more rows

I managed to do this with transmute when I grouped only by GENDER and option, but I have not been able to make it work since adding preference.

This is my approach without the preference column:

grouped_df <- df %>%
  group_by(GENDER, option) %>%
  summarize(num_users = n()) %>%
  spread(GENDER, num_users) %>%
  ungroup() %>%
  transmute(option_id = option,
            female_percent = Female / (Female + Male),
            male_percent = Male / (Female + Male)) %>%
  mutate(female_percent = round(100 * female_percent),
         male_percent = round(100 * male_percent))

How can I include preference in the approach above?

Answer 1

I think this is what you want. I had to create a data frame from scratch because you did not provide code to build one, but the idea is the same.

library(dplyr)
library(tidyr)

set.seed(1)

df <- data.frame(gender = rep(c("Male", "Female"), each = 10),
           preference = rep(c("Sweet", "Don't Know", "Medium Spicy", "Spicy", "Hot"), 2),
           option = rep(c(1, 2), 2),
           num_users = sample(1:150, 20))

df %>% 
  group_by(gender, option) %>% 
  mutate(perc = prop.table(num_users) * 100) %>%
  select(-num_users) %>% 
  spread(preference, perc)

# A tibble: 4 x 7
# Groups:   gender, option [4]
  gender option `Don't Know`   Hot `Medium Spicy` Spicy Sweet
  <fct>   <dbl>        <dbl> <dbl>          <dbl> <dbl> <dbl>
1 Female      1        22.8  24.7            33.6  12    6.82
2 Female      2         6.58 26.8            34.7  13.9 17.9 
3 Male        1        35.9   7.85           22.3  23.6 10.5 
4 Male        2        13.2   2.12           22.4  31.5 30.8
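
The code above gives the share of each preference within a gender/option group. If what is wanted is instead the female/male split within each preference and option, which is closer to the transmute code in the question, a minimal sketch might look like the following. The counts are made up for illustration, and pivot_wider() (tidyr 1.0.0 or later) is swapped in for spread(); neither comes from the original post.

```r
library(dplyr)
library(tidyr)

# Hypothetical counts for illustration only; not the asker's real data.
df <- data.frame(
  gender     = rep(c("Female", "Male"), each = 4),
  preference = rep(rep(c("Sweet", "Spicy"), each = 2), times = 2),
  option     = rep(c(1, 2), times = 4),
  num_users  = c(136, 28, 84, 20, 64, 12, 116, 30)
)

gender_share <- df %>%
  # The grouping columns define the denominator: one total per preference/option.
  group_by(preference, option) %>%
  mutate(percent = round(100 * num_users / sum(num_users))) %>%
  ungroup() %>%
  # Drop the raw counts so each preference/option pair collapses to one row.
  select(-num_users) %>%
  # One column per gender, holding that gender's rounded percentage.
  pivot_wider(names_from = gender, values_from = percent)

print(gender_share)
```

Changing the columns passed to group_by() changes what each percentage is relative to, so it is worth checking that the group totals are the ones you intend. Because the values are rounded, the Female and Male columns need not always sum to exactly 100 on other data.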
