Question

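I'm trying to make a chart showing the percentage of men and women with children under 18 in different age groups. I'd like a chart with two bars (one for men, one for women) side by side for each age group, and I want both bars to show the percentage with kids at the bottom rather than at the top (stacked bars). I can't figure out how to make such a chart in ggplot2 and would greatly appreciate suggestions.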

I calculated my grouped statistics using dplyr:

kid18summary <- marsub %>% 
group_by(AgeGroup, sex, kid_under_18) %>% 
summarise(n=n()) %>% 
mutate(freq = n/sum(n))

which produced this:

dput(kid18summary)
structure(list(AgeGroup = c("Age<40", "Age<40", "Age<40", "Age<40", 
"Age41-49", "Age41-49", "Age41-49", "Age41-49", "Age50-64", "Age50-64", 
"Age50-64", "Age50-64"), sex = structure(c(1L, 1L, 2L, 2L, 1L, 
1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("Male", "Female"), class = "factor"), 
    kid_under_18 = c("No", "Yes", "No", "Yes", "No", "Yes", "No", 
    "Yes", "No", "Yes", "No", "Yes"), freq = c(0.625, 0.375, 
    0.636833046471601, 0.363166953528399, 0.349557522123894, 
    0.650442477876106, 0.444897959183673, 0.555102040816327, 
    0.724852071005917, 0.275147928994083, 0.819548872180451, 
    0.180451127819549)), .Names = c("AgeGroup", "sex", "kid_under_18", 
"freq"), class = c("grouped_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -12L), vars = list(AgeGroup, sex), drop = TRUE, indices = list(
    0:1, 2:3, 4:5, 6:7, 8:9, 10:11), group_sizes = c(2L, 2L, 
2L, 2L, 2L, 2L), biggest_group_size = 2L, labels = structure(list(
    AgeGroup = c("Age<40", "Age<40", "Age41-49", "Age41-49", 
    "Age50-64", "Age50-64"), sex = structure(c(1L, 2L, 1L, 2L, 
    1L, 2L), .Label = c("Male", "Female"), class = "factor")), class = "data.frame", row.names = c(NA, 
-6L), vars = list(AgeGroup, sex), drop = TRUE, .Names = c("AgeGroup", 
"sex")))

I can plot the proportions with and without kids under 18 for each age group and sex:

ggplot(kid18summary, aes(x = factor(AgeGroup), y = freq, fill = factor(sex)), color = factor(sex)) +
  geom_bar(position = "dodge", stat = "identity") + scale_y_continuous(labels = percent)

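Or I can make a faceted stacked bar chart, which is closer to what I want. I'd like to show both "Yes" and "No" even though the percentages add up to 100, because I think comparing the negative space is easier than comparing the colored bars. The only trouble is that whatever I do, "No" ends up on the bottom and "Yes" on top, and I'd like it the other way around. (Ideally, I'd really like different colors for men and women: dark blue for men with kids and light blue for men without; dark red for women with kids and light red for women without. I've given up on that for now.)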

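I've tried changing the order of the factor in various ways, all completely unsuccessful.

As described in the ggplot2 documentation, I tried changing the order of the factor levels directly: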

kid18summary$kid_under_18 <- as.factor(kid18summary$kid_under_18)
o <- c("Yes", "No")  # which I've also changed to ("No", "Yes"), which makes no difference; the order of the Yes and No in the legend changes, but the "Yes" bars stay on top
kid18summary$kid_under_18 <- factor(kid18summary$kid_under_18, levels = o)

kid18summary$kid_under_18 <- factor(kid18summary$kid_under_18, levels(kid18summary$kid_under_18)[c("Yes", "No")])  # changing to [c("No", "Yes")] also only changes the order of the legend

I've tried the answer suggested in another question and added another, ordered factor:

kid18summary <- transform(kid18summary, stack.ord = factor(kid_under_18, levels = c("Yes", "No"), ordered = TRUE))
ggplot(kid18summary, aes(x = factor(sex), y = freq, fill = factor(stack.ord)), color = factor(stack.ord)) + geom_bar(stat = "identity") + scale_y_continuous(labels = percent) + facet_wrap(~AgeGroup, nrow=1)

Or just added another dummy variable:

kid18summary$orderfactor <- "NA"
kid18summary$orderfactor[kid18summary$kid_under_18 == "Yes"] <- 0
kid18summary$orderfactor[kid18summary$kid_under_18 == "No"] <- 1
ggplot(kid18summary, aes(x = factor(sex), y = freq, fill = factor(orderfactor)), color = factor(orderfactor)) + geom_bar(stat = "identity") + scale_y_continuous(labels = percent) + facet_wrap(~AgeGroup, nrow=1)

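All of these give me plenty of ways to switch the colors of the Yes and No groups within the bars, but none of them changes which group is actually on top. Plot1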

Answer 1

Based on the answer suggested by aosmith, I ended up with the following, which is exactly what I wanted:

ggplot(arrange(df, kid_under_18), aes(x = factor(sex), y = freq, fill = interaction(sex, factor(kid_under_18))), color = factor(kid_under_18)) + 
geom_bar(stat = "identity") + scale_y_continuous(labels = percent) + 
facet_wrap(~AgeGroup, nrow=1)
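
The call above controls the stacking through the row order produced by arrange(), which, as far as I know, is how ggplot2 releases before 2.2.0 behaved. For anyone on a newer version, here is a minimal sketch that also covers the per-sex colors mentioned in the question. It assumes ggplot2 2.2.0 or later (for geom_col() and position_stack(reverse = TRUE)). The data frame name kids, the column fill_group, the rounded frequencies, and the color choices are all my own illustrative assumptions, not part of the original post.

```r
library(ggplot2)

# Toy data rebuilt from the question's dput() output (frequencies rounded).
kids <- data.frame(
  AgeGroup = rep(c("Age<40", "Age41-49", "Age50-64"), each = 4),
  sex = factor(rep(rep(c("Male", "Female"), each = 2), times = 3),
               levels = c("Male", "Female")),
  kid_under_18 = rep(c("No", "Yes"), times = 6),
  freq = c(0.625, 0.375, 0.637, 0.363, 0.350, 0.650,
           0.445, 0.555, 0.725, 0.275, 0.820, 0.180)
)

# Put "Yes" first in the level order.
kids$kid_under_18 <- factor(kids$kid_under_18, levels = c("Yes", "No"))

# One fill level per sex/kid combination, e.g. "Male.Yes".
kids$fill_group <- interaction(kids$sex, kids$kid_under_18)

# Hypothetical palette: dark shade = has kids, light shade = no kids.
fill_colours <- c(
  "Male.Yes"   = "darkblue", "Male.No"   = "lightblue",
  "Female.Yes" = "darkred",  "Female.No" = "pink"
)

p <- ggplot(kids, aes(x = sex, y = freq, fill = fill_group)) +
  # reverse = TRUE stacks the first factor level at the bottom of each bar
  geom_col(position = position_stack(reverse = TRUE)) +
  scale_y_continuous(labels = scales::percent) +
  scale_fill_manual(values = fill_colours) +
  facet_wrap(~AgeGroup, nrow = 1)

p
```

Without reverse = TRUE, newer ggplot2 versions stack the first level on top so that the bars match the legend order, so the alternative is to leave the position at its default and order the levels as c("No", "Yes") instead.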

Can you change the order of factors in a faceted stacked bar chart in ggplot2?
