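I want to draw an age pyramid in R, similar to the one in "Population pyramid plot with ggplot2 and dplyr (instead of plyr)".
The problem is that my data is already aggregated by subgroup. So instead of counting the rows where age is 65, I need the sum of all the values in number for age 65.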
For example, given the data below, how should I change the plotting code from that question?
df = structure(list(number = c(26778, 28388, 23491, 18602, 15787,
24536), gender = c("F", "M", "F", "M", "F", "M"), age = c(65,
65, 65, 65, 74, 58)), .Names = c("number", "gender", "age"), row.names = c(142L,
234L, 243L, 252L, 298L, 356L), class = "data.frame")
Answer 0 (score: 2)
You can aggregate the data beforehand and then pass it to ggplot, like this:
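library(dplyr)
library(ggplot2)

# Sum the pre-aggregated counts (number) for each gender/age combination
df1 <- df %>% group_by(gender, age) %>% summarise(total = sum(number))
# Draw women on the positive side and men on the negative side, then flip the axes
ggplot(data = df1, aes(x = age, y = total, fill = gender)) +
  geom_bar(data = filter(df1, gender == "F"), stat = "identity") +
  geom_bar(data = filter(df1, gender == "M"), stat = "identity", aes(y = -total)) +
  coord_flip()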
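To check what the aggregation step should produce before plotting, here is a minimal sketch using base R's aggregate() in place of the dplyr pipeline. It rebuilds the six rows from the question (row names omitted) and sums number per gender and age. The names totals, f65 and m65 are only illustrative.

```r
# Same six rows as in the question (row names omitted for brevity)
df <- data.frame(
  number = c(26778, 28388, 23491, 18602, 15787, 24536),
  gender = c("F", "M", "F", "M", "F", "M"),
  age    = c(65, 65, 65, 65, 74, 58)
)

# Base R counterpart of group_by(gender, age) followed by summarise(sum(number))
totals <- aggregate(number ~ gender + age, data = df, FUN = sum)
print(totals)

# The two F rows at age 65 collapse into one value: 26778 + 23491 = 50269
f65 <- totals$number[totals$gender == "F" & totals$age == 65]
# The two M rows at age 65 collapse into one value: 28388 + 18602 = 46990
m65 <- totals$number[totals$gender == "M" & totals$age == 65]
```

Each gender/age pair then contributes exactly one bar to the pyramid, with a length equal to the summed number rather than a row count.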