Question

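I am working with GSS data that contains demographic information (race, education, income, etc.) along with responses to various political questions (each answered "Yes", "No", or "Don't know"). Eventually I want to build an app that lets me draw all of these different plots so I can compare differences in opinion by demographic, but that is a long way off. Right now I am just trying to figure out how to plot stacked bar charts efficiently. I started by comparing income against the abortion question, and here is what I did. I began with a dataset called GSSabortionincome, which for the most part looks like this: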

`Gss year for t~ `Race of respon~ `Respondents se~ `Total family in~ `Abortion`
             <dbl> <chr>            <chr>            <chr>            <chr>           
1             2018 White            Female           $30000 to 39999  Yes              
2             2018 Black            Male             $40000 to 49999  No

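First, I dropped all the irrelevant information and counted the answers for each combination of "Abortion" and income, then computed each answer's share within its income group. That gave me this (the data frame is called GSSabortionincomepercent):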

`Total family income` `Abortion if woman wants for any reason`     n percent
  <chr>                 <chr>                                    <int>   <dbl>
1 $10000 to 19999       Don't know                                   1    0.01
2 $10000 to 19999       No                                          87    0.59

Plotting all of this is simple:

GSSabortionincomepercent %>%
  ggplot(mapping = aes(x = `Total family income`,
                       y = percent,
                       fill = `Abortion if woman wants for any reason`)) +
  geom_bar(stat = "identity") +
  geom_text(aes(label=percent), position = position_stack(vjust = 0.5),
            color="white", size=3.5)+
  coord_flip()

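However, if I want to compare income, political opinion, and other demographics (such as race), I cannot do it with GSSabortionincomepercent, because I threw away everything that was irrelevant at the time. Once I realized this, I went back and tried to keep the other demographic information. This is what I have: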

GSS2018anyincomecount<- GSS2018anyincome %>%
  group_by(`Total family income`) %>%
  count(`Abortion if woman wants for any reason`)
GSS2018anyincomepercent <- GSS2018anyincomecount %>%
    group_by(`Total family income`) %>%
    mutate(percent = round(n/sum(n), digits = 2))

GSS2018anyincomefinal <- left_join(GSS2018anyincome,
                                   GSS2018anyincomepercent,
                                   by = c("Total family income","Abortion if woman wants for any reason"))

`Gss year for t~ `Race of respon~ `Respondents se~ `Total family in~ `Abortion`     n percent
             <dbl> <chr>            <chr>            <chr>            <chr>      <int>   <dbl>
1             2018 White            Female           $30000 to 39999  Yes            3    0.51
2             2018 Black            Male             $40000 to 49999  No             5    0.47

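This is a mess, partly because I only computed n and percent for total family income and abortion, so they do not carry over to any of the other demographics, and partly because ggplot simply adds all the individual percentages together when I plot it.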
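A minimal sketch of why the percentages pile up, using made-up toy data (the `toy` data frame and its `income` and `abortion` columns are hypothetical stand-ins, not real GSS values). Joining the per-group percentages back onto the respondent-level rows repeats each percentage once per respondent, so a stacked bar would sum those repeats:

```r
library(dplyr)

# Hypothetical toy data standing in for the raw respondent-level rows
toy <- data.frame(
  income   = c("low", "low", "low", "high", "high", "high", "high"),
  abortion = c("Yes", "Yes", "No", "Yes", "No", "No", "No")
)

# Share of each answer within each income level
toy_percent <- toy %>%
  count(income, abortion) %>%
  group_by(income) %>%
  mutate(percent = n / sum(n)) %>%
  ungroup()

# Joining back onto the raw rows repeats each share once per respondent
toy_joined <- left_join(toy, toy_percent, by = c("income", "abortion"))

# What a stacked bar adds up per income level: more than 1
stacked_total <- toy_joined %>%
  group_by(income) %>%
  summarise(total = sum(percent))

# Keeping one row per income/answer pair brings each total back to 1
deduped_total <- toy_joined %>%
  distinct(income, abortion, percent) %>%
  group_by(income) %>%
  summarise(total = sum(percent))
```

Under these assumptions, `stacked_total` exceeds 1 for every income level, while `deduped_total` does not, which suggests the plot should be fed one row per group rather than one row per respondent.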

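The resulting plot is obviously not what I want. Eventually I want to bring in the other demographics as well, but I cannot get there until these percentages are sorted out. I know there must be a better way to do this. Does anyone know one?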
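One possible direction, sketched under assumptions rather than offered as a definitive answer: keep the raw respondent-level data untouched and compute the percentages on the fly for whichever demographic column is being plotted. The helper names `answer_shares` and `plot_answer_shares` below are hypothetical and not part of any package. They rely on the `{{ }}` embracing operator for passing column names into dplyr and ggplot2.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: share of each answer within each level of one
# demographic column, computed directly from the raw rows
answer_shares <- function(data, demographic, answer) {
  data %>%
    count({{ demographic }}, {{ answer }}) %>%
    group_by({{ demographic }}) %>%
    mutate(percent = round(n / sum(n), digits = 2)) %>%
    ungroup()
}

# Hypothetical helper: stacked bar chart of those shares, styled like the
# plot earlier in the question
plot_answer_shares <- function(data, demographic, answer) {
  answer_shares(data, {{ demographic }}, {{ answer }}) %>%
    ggplot(aes(x = {{ demographic }}, y = percent, fill = {{ answer }})) +
    geom_col() +
    geom_text(aes(label = percent),
              position = position_stack(vjust = 0.5),
              color = "white", size = 3.5) +
    coord_flip()
}
```

With the data frame from the question, a call would look something like `plot_answer_shares(GSS2018anyincome, `Total family income`, `Abortion if woman wants for any reason`)`, and any other demographic column could be swapped in as the second argument without rebuilding a separate percent table.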

How can I efficiently plot percentages for political questions from the GSS in ggplot?

0 answers: