Can you change the order of factors in a faceted stacked bar chart in ggplot2?

Asked: 2016-06-07 22:48:52

Tags: r plot ggplot2

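I am trying to make a chart showing the percentage of men and women who have children under 18, by age group. I would like two bars side by side for each age group (one for men, one for women), and I would like each stacked bar to show the percentage with children at the bottom rather than at the top. I can't figure out how to make such a chart in ggplot2 and would greatly appreciate suggestions.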

I calculated my grouped statistics with dplyr:

kid18summary <- marsub %>% 
group_by(AgeGroup, sex, kid_under_18) %>% 
summarise(n=n()) %>% 
mutate(freq = n/sum(n))

which produced this:

dput(kid18summary)
structure(list(AgeGroup = c("Age<40", "Age<40", "Age<40", "Age<40", 
"Age41-49", "Age41-49", "Age41-49", "Age41-49", "Age50-64", "Age50-64", 
"Age50-64", "Age50-64"), sex = structure(c(1L, 1L, 2L, 2L, 1L, 
1L, 2L, 2L, 1L, 1L, 2L, 2L), .Label = c("Male", "Female"), class = "factor"), 
    kid_under_18 = c("No", "Yes", "No", "Yes", "No", "Yes", "No", 
    "Yes", "No", "Yes", "No", "Yes"), freq = c(0.625, 0.375, 
    0.636833046471601, 0.363166953528399, 0.349557522123894, 
    0.650442477876106, 0.444897959183673, 0.555102040816327, 
    0.724852071005917, 0.275147928994083, 0.819548872180451, 
    0.180451127819549)), .Names = c("AgeGroup", "sex", "kid_under_18", 
"freq"), class = c("grouped_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -12L), vars = list(AgeGroup, sex), drop = TRUE, indices = list(
    0:1, 2:3, 4:5, 6:7, 8:9, 10:11), group_sizes = c(2L, 2L, 
2L, 2L, 2L, 2L), biggest_group_size = 2L, labels = structure(list(
    AgeGroup = c("Age<40", "Age<40", "Age41-49", "Age41-49", 
    "Age50-64", "Age50-64"), sex = structure(c(1L, 2L, 1L, 2L, 
    1L, 2L), .Label = c("Male", "Female"), class = "factor")), class = "data.frame", row.names = c(NA, 
-6L), vars = list(AgeGroup, sex), drop = TRUE, .Names = c("AgeGroup", 
"sex")))

I can plot the proportions with and without children under 18 for each age group and sex:

ggplot(kid18summary, aes(x = factor(AgeGroup), y = freq, fill = factor(sex)), color = factor(sex)) +
  geom_bar(position = "dodge", stat = "identity") + scale_y_continuous(labels = percent)

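Or I can make a faceted stacked bar chart, which is closer to what I want: I would like to show both "Yes" and "No" even though the percentages add up to 100, because I think comparing the negative space makes it easier to compare the colored bars. The only trouble is that whatever I do, "No" is at the bottom and "Yes" is at the top, and I would like it the other way round. (Ideally I would also like different colors by sex: dark blue for men with children and light blue for men without, dark red for women with children and light red for women without. I have given up on that for now.)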

I have tried changing the order of the factor in various ways, all completely unsuccessful.

As described in the ggplot2 documentation, I tried changing the order of the factor levels directly:

kid18summary$kid_under_18 <- as.factor(kid18summary$kid_under_18)
o <- c("Yes", "No")  # which I've also changed to ("No", "Yes"), which makes no difference; the order of the Yes and No in the legend changes, but the "Yes" bars stay on top
kid18summary$kid_under_18 <- factor(kid18summary$kid_under_18, levels = o)

kid18summary$kid_under_18 <- factor(kid18summary$kid_under_18, levels(kid18summary$kid_under_18)[c("Yes", "No")])  # changing to [c("No", "Yes")] also only changes the order of the legend
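A small base R illustration of what these attempts do to the factor itself, separate from the plot. This is only a sketch on a made-up vector `x` (not the real data); the names `f_default` and `f_reordered` are hypothetical. It shows that `factor(..., levels = ...)` changes the level order, which is what the legend follows, without touching the values, and that `levels()` returns an unnamed vector, so it has to be indexed by position rather than by name.

```r
# Made-up stand-in for kid18summary$kid_under_18
x <- c("No", "Yes", "No", "Yes")

f_default   <- factor(x)                           # levels are sorted: "No", "Yes"
f_reordered <- factor(x, levels = c("Yes", "No"))  # levels as given: "Yes", "No"

levels(f_default)
levels(f_reordered)
as.character(f_reordered)  # the values themselves are unchanged

# levels() has no names, so indexing by name gives NA for each entry
by_name <- levels(f_default)[c("Yes", "No")]

# indexing by position returns the levels in the requested order
by_position <- levels(f_default)[c(2, 1)]
```

Reordering the levels this way affects how the groups are listed; whether it also changes the stacking depends on how the installed ggplot2 version orders stacked bars.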

I have also tried the answer suggested in another question, adding a separate ordered factor:

kid18summary <- transform(kid18summary, stack.ord = factor(kid_under_18, levels = c("Yes", "No"), ordered = TRUE))
ggplot(kid18summary, aes(x = factor(sex), y = freq, fill = factor(stack.ord)), color = factor(stack.ord)) + geom_bar(stat = "identity") + scale_y_continuous(labels = percent) + facet_wrap(~AgeGroup, nrow=1)

Or just adding another dummy variable:

kid18summary$orderfactor <- "NA"
kid18summary$orderfactor[kid18summary$kid_under_18 == "Yes"] <- 0
kid18summary$orderfactor[kid18summary$kid_under_18 == "No"] <- 1
ggplot(kid18summary, aes(x = factor(sex), y = freq, fill = factor(orderfactor)), color = factor(orderfactor)) + geom_bar(stat = "identity") + scale_y_continuous(labels = percent) + facet_wrap(~AgeGroup, nrow=1)

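All of these gave me plenty of ways to swap the colors of the Yes and No groups within the bars, but none of them changed which group is actually on top. Plot1 Plot2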

1 Answer:

Answer 0: (score: 1)

Following the answer suggested by aosmith, I ended up with the following, which is exactly what I wanted:

ggplot(arrange(df, kid_under_18), aes(x = factor(sex), y = freq, fill = interaction(sex, factor(kid_under_18))), color = factor(kid_under_18)) + 
geom_bar(stat = "identity") + scale_y_continuous(labels = percent) + 
facet_wrap(~AgeGroup, nrow=1)
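A note for readers on a newer ggplot2: as far as I know, since version 2.2.0 the stacking order follows the grouping (the factor levels) in reverse rather than the row order of the data, so `arrange()` may no longer be what controls it. The sketch below is not the code from the answer above. It rebuilds the summary by hand from the `dput` output in the question (frequencies rounded to four decimals), and the names `kid18` and `sex_kid` and the specific color names are my own assumptions. It also assigns one color per sex and Yes/No combination, as the question asked for.

```r
library(ggplot2)

# Hand-built copy of the summary table from the question (freq rounded)
kid18 <- data.frame(
  AgeGroup = rep(c("Age<40", "Age41-49", "Age50-64"), each = 4),
  sex = factor(rep(rep(c("Male", "Female"), each = 2), times = 3),
               levels = c("Male", "Female")),
  kid_under_18 = rep(c("No", "Yes"), times = 6),
  freq = c(0.625, 0.375, 0.6368, 0.3632, 0.3496, 0.6504,
           0.4449, 0.5551, 0.7249, 0.2751, 0.8195, 0.1805)
)

# Assumption: ggplot2 >= 2.2.0, where the last group is stacked at the bottom,
# so "Yes" goes last in the level order
kid18$kid_under_18 <- factor(kid18$kid_under_18, levels = c("No", "Yes"))

# One fill group per sex/answer combination, e.g. "Male.Yes"
kid18$sex_kid <- interaction(kid18$sex, kid18$kid_under_18)

p <- ggplot(kid18, aes(x = sex, y = freq, fill = sex_kid)) +
  geom_col() +
  scale_y_continuous(labels = scales::percent) +
  scale_fill_manual(values = c(
    "Male.Yes"   = "darkblue", "Male.No"   = "lightblue",
    "Female.Yes" = "darkred",  "Female.No" = "lightpink"
  )) +
  facet_wrap(~AgeGroup, nrow = 1)

# print(p) draws the chart
```

If the stack comes out the other way round on your version, passing `position = position_stack(reverse = TRUE)` to `geom_col()` should flip it without touching the factor levels.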