Question

I created a bar chart to show the population distribution of Vietnam. Here is my vietnam2015 data:

   Year Age.group Est.pop
1  2015       0-4    7753
2  2015       5-9    7233
3  2015     10-14    6623
4  2015     15-19    6982
5  2015     20-24    8817
6  2015     25-29    8674
7  2015     30-34    7947
8  2015     35-39    7166
9  2015     40-44    6653
10 2015     45-49    6011
11 2015     50-54    5469
12 2015     55-59    4623
13 2015     60-64    3310
14 2015     65-69    1896
15 2015     70-74    1375
16 2015     75-79    1162
17 2015       80+    1878

Here is my bar chart. I would like to know whether I can make a dot plot instead of a bar chart.

library(tidyverse)

vietnam2015 %>%
  filter(Age.group != "5-9") %>% # Somehow this odd value crept into the data frame, so it is filtered out.
  ggplot(aes(x = Age.group, y = Est.pop)) +
  geom_col(colour = "black",
           fill = "#FFEB3B")

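I know that dot plots are usually used for data with relatively few data points. But can I create a dot plot in which one dot represents 1,000 people, or a million? I would like to communicate more clearly that the bars are made up of people. Something like the FlowingData example, the middle image: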

Answer 1

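We can use geom_dotplot. As you mentioned, dot plots are usually used for small counts, but we can aggregate the data. In the code below, I use mutate(Est.pop = round(Est.pop, digits = -3)/1000) to round Est.pop to the nearest thousand and then divide it by 1000. After that, I repeat each Age.group row as many times as the number calculated in the Est.pop column. Finally, I use geom_dotplot to plot the data. Each dot represents 1,000 people. The y axis is hidden because I think this visualization is mainly about the number of dots.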


Data

# Load package
library(tidyverse)

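# `dt` is not defined in this answer as shown; it is assumed here to be
# the question's vietnam2015 data frame
dt <- vietnam2015

# Process the data: express Est.pop in rounded thousands, then repeat each
# Age.group row that many times so that every row becomes one dot
dt2 <- dt %>%
  mutate(Est.pop = round(Est.pop, digits = -3)/1000) %>%
  split(f = .$Age.group) %>%
  map_df(function(x) x[rep(row.names(x), x$Est.pop[1]), ])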

# Plot the data
ggplot(dt2, aes(x = Age.group)) +
  geom_dotplot() +
  scale_y_continuous(NULL, breaks = NULL)
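
As a hedged alternative sketch of the row-repetition step, tidyr::uncount() repeats each row according to a weight column, which yields the same one-row-per-dot layout without split() and map_df(). The data frame dt is rebuilt here from the question's table; the names dots and dt_dots are introduced only for illustration and are not part of the original answer. Turning Age.group into a factor ordered by appearance is an added assumption (the data is sorted by age), made so the x axis follows age order rather than alphabetical order.

```r
library(tidyverse)

# Data copied from the question's table
dt <- data.frame(
  Year = 2015L,
  Age.group = c("0-4", "5-9", "10-14", "15-19", "20-24", "25-29", "30-34",
                "35-39", "40-44", "45-49", "50-54", "55-59", "60-64",
                "65-69", "70-74", "75-79", "80+"),
  Est.pop = c(7753, 7233, 6623, 6982, 8817, 8674, 7947, 7166, 6653,
              6011, 5469, 4623, 3310, 1896, 1375, 1162, 1878)
)

dt_dots <- dt %>%
  mutate(
    # Assumption: rows are already sorted by age, so keep that order on the x axis
    Age.group = factor(Age.group, levels = unique(Age.group)),
    # Number of dots per group: Est.pop rounded to the nearest thousand, in thousands
    dots = round(Est.pop, digits = -3) / 1000
  ) %>%
  # Repeat each row `dots` times; the weight column is dropped by default
  uncount(dots)

# Same plot call as in the answer, applied to the expanded data
ggplot(dt_dots, aes(x = Age.group)) +
  geom_dotplot() +
  scale_y_continuous(NULL, breaks = NULL)
```

Because every row of dt_dots stands for one dot, nrow(dt_dots) gives the total number of dots that will be drawn.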

Answer 2

Maybe you could generate values from zero to Est.pop for each Age.group and plot them. But I am sure there are other, better ways to do it.

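library(reshape2)
library(ggplot2)

# Reshape to wide format: one column per age group
# (df is defined in the data section of this answer)
df2 = dcast(data = df, Year~Age.group, value.var = "Est.pop")

# For each age group, generate one value every 200 from 0 up to its Est.pop
df3 = do.call(rbind, lapply(2:NCOL(df2), function(i)
  data.frame(Age.group = names(df2)[i], Est.pop = seq(0, df2[,i], 200))))

# Plot one point per generated value, leaving out the "5-9" group as in the question
ggplot(data = df3[df3$Age.group != "5-9",],
       aes(x = factor(Age.group), y = Est.pop)) +
  geom_point()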
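
To make the seq() step concrete, here is a small sketch for a single age group, using the 75-79 value from the question's table. The variable names est_pop and steps are introduced only for this illustration. It shows that each group gets one point per 200 units, starting at 0, so the topmost point can sit up to 199 units below the actual value.

```r
# Est.pop for the 75-79 group, taken from the question's table
est_pop <- 1162

# One value every 200 units, starting at 0 and never exceeding est_pop
steps <- seq(0, est_pop, by = 200)
steps

# Number of points drawn for this group: floor(est_pop / 200) + 1
length(steps)
```

A smaller step than 200 gives more points per column and a closer match to the real values, at the cost of more overplotting.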

Data

df = structure(list(Year = c(2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L, 2015L), Age.group = c("0-4", "5-9", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40-44", "45-49", "50-54", "55-59", "60-64", "65-69", "70-74", "75-79", "80+"), Est.pop = c(7753L, 7233L, 6623L, 6982L, 8817L, 8674L, 7947L, 7166L, 6653L, 6011L, 5469L, 4623L, 3310L, 1896L, 1375L, 1162L, 1878L)), .Names = c("Year", "Age.group", "Est.pop"), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17"))

How to create a dot plot with a large number of values in ggplot2
