Plotting subsets of a variable on different grids

Time: 2019-05-17 10:49:54

Tags: r ggplot2

I have this data:

library(dplyr)

samp %>%
head(5)
# A tibble: 929 x 3
    time  city sales
   <int> <dbl> <dbl>
 1     0     1   248
 2     0     2   187
 3     0     3   459
 4     0     5  1422
 5     0     7   196
 6     0     8   397

I want to plot sales against time for each city as a line chart. There are 31 cities in total, so plotting them all in a single chart gets messy.

library(ggplot2)

samp %>%
  ggplot(aes(x = time, y = sales, color = factor(city))) +
  geom_line() 


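My goal is to plot 6 cities in each panel and then arrange the panels in a grid of about 6 (31/6). One could use facet_wrap, but that puts only one city in each panel. How can I put 6 cities in each panel, so that there are only 6 panels with roughly 6 cities each?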

samp %>%
      ggplot(aes(x = time, y = sales, color = factor(city))) +
      facet_wrap(.~city) +
      geom_line() 


structure(list(time = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), city = c(1, 2, 3, 
5, 7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 18, 19, 20, 21, 22, 23, 
25, 26, 29, 31, 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 16, 
17, 18, 19, 20, 21, 22, 23, 25, 26, 29, 31, 1, 2, 3, 4, 5, 7, 
8, 9, 10, 11, 12, 13, 14, 16, 17, 18, 19, 20, 21, 22, 23, 25, 
26, 29, 30, 31, 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 16, 
17, 18, 19, 20, 21, 22, 23, 25, 26, 29, 30, 31), sales = c(248, 
187, 459, 1422, 196, 397, 438, 636, 616, 729, 648, 7291, 488, 
520, 370, 417, 826, 726, 895, 426, 797, 839, 589, 452, 135, 221, 
496, 187, 1594, 269, 453, 466, 664, 656, 784, 683, 8023, 545, 
580, 424, 459, 855, 679, 975, 422, 694, 899, 528, 472, 237, 272, 
563, 362, 2078, 320, 561, 565, 814, 829, 1095, 878, 10403, 705, 
755, 630, 501, 1193, 884, 1416, 533, 1071, 1353, 729, 2269, 583, 
168, 180, 63, 252, 1137, 201, 466, 299, 523, 564, 616, 611, 7259, 
483, 489, 371, 355, 753, 526, 918, 445, 683, 746, 485, 1703, 
408)), row.names = c(NA, -101L), class = c("grouped_df", "tbl_df", 
"tbl", "data.frame"), vars = "time", drop = TRUE, indices = list(
    0:23, 24:48, 49:74, 75:100), group_sizes = c(24L, 25L, 26L, 
26L), biggest_group_size = 26L, labels = structure(list(time = 0:3), row.names = c(NA, 
-4L), class = "data.frame", vars = "time", drop = TRUE))

1 answer:

Answer 0: (score: 0)

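Since your question is about plotting subsets of a variable on different grids to improve readability, I will follow the suggestions of the two commenters above and combine their approaches.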

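First create groups of cities based on their sales figures (sales_group), then add a second, essentially arbitrary, grouping (city_group) to reduce the number of cities per panel. Then plot with: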

facet_grid(sales_group ~ city_group, scales = "free_y")

As a next step you might think about how to color or label the cities, but I guess you can take it from here.

Good luck.

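library(dplyr)
library(ggplot2)

samp %>%
    group_by(city) %>%
    # classify each city by its peak sales; conditions are checked in order,
    # so values exactly on a boundary (550, 750, 1100) are still classified
    mutate(sales_max = max(sales),
           sales_group = case_when(
               sales_max < 550 ~ "sales low",
               sales_max < 750 ~ "sales medium",
               sales_max < 1100 ~ "sales high",
               TRUE ~ "sales very high"
           )) %>%
    ungroup() %>%
    group_by(sales_group) %>%
    # within each sales group, split the cities into two bins by city ID
    mutate(city_group = cut_number(city, 2, labels = c("group 1", "group 2"))) %>%
    ggplot(aes(x = time, y = sales, color = factor(city))) +
    # one row per sales group, one column per city bin, free y-axis per row
    facet_grid(sales_group ~ city_group, scales = "free_y") +
    geom_line()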
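
If the aim is literally "six cities per panel", as the question asks, one possible variation is to number the cities and cut that numbering into chunks of six. The following is only a minimal sketch under stated assumptions: `toy` is a made-up stand-in for `samp` with the same three columns, and `per_panel` and `panel_id` are names introduced here for illustration, not anything from the original post.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for `samp` (made-up values; same columns: time, city, sales)
set.seed(1)
toy <- expand.grid(time = 0:3, city = 1:31) %>%
  mutate(sales = round(runif(n(), 100, 1000)))

per_panel <- 6  # assumed number of cities per panel

toy_panels <- toy %>%
  ungroup() %>%  # no effect on `toy`; needed for grouped data such as `samp`
  # dense_rank() numbers the distinct city IDs 1, 2, 3, ... without gaps,
  # so consecutive cities land in the same chunk of `per_panel`
  mutate(panel_id = ceiling(dense_rank(city) / per_panel))

# One facet per chunk of cities instead of one facet per city
p <- ggplot(toy_panels, aes(x = time, y = sales, color = factor(city))) +
  geom_line() +
  facet_wrap(~ panel_id, scales = "free_y", labeller = label_both)
p
```

With 31 cities this gives five panels of six cities and a sixth panel holding the one remaining city. Unlike the sales-based grouping above, cities with very different sales levels can end up in the same panel, so a single large city may still flatten the other lines in its panel.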