Plotting multiple lines over time in ggplot2; want to distinguish the lines better

Time: 2019-06-25 13:41:56

Tags: r ggplot2 time linegraph

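I am mainly posting because I really think I have overcomplicated this. I am plotting 12 different lines over time, and I would like each day on the x-axis to be labelled with its "title".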

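I have tried a few solutions and what I have "works", but it is not great. Ignore the placeholders I have in there; I would like to improve them in places and show more clearly where each person stands. My code also seems rather verbose; maybe there is a better way to do this.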

riddle_log <- structure(list(date = structure(c(1559779200, 1559865600, 1560124800, 
1560211200, 1560297600, 1560384000, 1560470400, 1560470400, 1560470400, 
1560729600, 1560729600, 1560816000, 1560902400, 1560988800, 1561075200, 
1561334400), class = c("POSIXct", "POSIXt"), tzone = "UTC"), 
    title = c("The Midget", "Bowling Balls", "Poisonous Ice", 
    "Dog Crosses River", "Camel Race", "Two Masked Men", "The Cabin", 
    "Black Truck", "Burglary", "Japanese Ship", "Haunted Floor", 
    "East and West", "Filling the Room", "Untied", "Window Jumper", 
    "Window Faller"), Brigid = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0), Carly = c(0, 1, 1, 1, 2, 2, 3, 3, 3, 3, 
    3, 3, 3, 3, 3, 3), Christian = c(1, 1, 1, 1, 1, 1, 1, 1, 
    2, 2, 3, 3, 3, 3, 4, 4), Daniel = c(0, 0, 0, 0, 0, 1, 1, 
    2, 2, 2, 2, 3, 3, 3, 3, 3.5), Jess = c(0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0), Luke = c(0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0), Mara = c(0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0), Marcus = c(0, 0, 0, 0, 0, 
    0, 0, 0, 0, 1, 2, 2, 3, 3, 3, 3.5), Nassim = c(0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), Nathalie = c(0, 0, 1, 
    2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), Neil = c(0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)), row.names = c(NA, 
-16L), class = c("tbl_df", "tbl", "data.frame"))

library(tidyverse)
library(ggthemes)

line1 <- riddle_log %>% 
  select(date, Brigid)

line2 <- riddle_log %>% 
  select(date, Carly)

line3 <- riddle_log %>% 
  select(date, Christian)

line4 <- riddle_log %>% 
  select(date, Daniel)

line5 <- riddle_log %>% 
  select(date, Jess)

line6 <- riddle_log %>% 
  select(date, Luke)

line7 <- riddle_log %>% 
  select(date, Mara)

line8 <- riddle_log %>% 
  select(date, Marcus)

line9 <- riddle_log %>% 
  select(date, Nassim)

line10 <- riddle_log %>% 
  select(date, Nathalie)

line11 <- riddle_log %>% 
  select(date, Neil)

ggplot() + 
  geom_line(data = line1, aes(x = date, y = Brigid, color = "a")) +
  geom_line(data = line2, aes(x = date, y = Carly, color = "b")) +
  geom_line(data = line3, aes(x = date, y = Christian, color = "c")) +
  geom_line(data = line4, aes(x = date, y = Daniel, color = "d")) +
  geom_line(data = line5, aes(x = date, y = Jess, color = "e")) +
  geom_line(data = line6, aes(x = date, y = Luke, color = "f")) +
  geom_line(data = line7, aes(x = date, y = Mara, color = "g")) +
  geom_line(data = line8, aes(x = date, y = Marcus, color = "h")) +
  geom_line(data = line9, aes(x = date, y = Nassim, color = "i")) +
  geom_line(data = line10, aes(x = date, y = Nathalie, color = "j")) +
  geom_line(data = line11, aes(x = date, y = Neil, color = "k")) +
  scale_color_manual(name = "Analysts", 
                     values = c("a" = "blue", "b" = "red", "c" = "orange", "d" = "black",
                                "e" = "steelblue", "f" = "blue", "g" = "blue", "h" = "blue",
                                "i" = "blue", "j" = "blue", "k" = "blue")) +
  xlab('Date') +
  ylab('Wins') +
  ggtitle(" NAME ") 

#+
 # scale_x_date(breaks = as.Date(c("2019-05-01", "2019-08-15")))



 # scale_x_discrete(name, breaks, labels, limits)

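In short, there are several things I would like to add:
- All dates are represented on the x-axis. Weekends are skipped, but I do not want them to leave gaps in the plot; the days should be treated as consecutive.
- If the titles could somehow be incorporated, that would be cool, although I have been struggling to work out how, since some days have multiple titles.
- A clearer way to see the progress of all the lines, instead of the poor overlapping that is happening here.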

If there are any themes better suited to this kind of problem, I am all ears.

2 answers:

Answer 0: (score: 1)

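Here is an example that converts the data to "long" format, which simplifies the ggplot call. I have also added a geom_jitter layer to make days with overlapping values easier to see.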

riddle_log %>%
  tidyr::gather(Analyst, Wins, -c(date, title)) %>%
  ggplot(aes(x = date, y = Wins, color = Analyst)) +
  geom_line() +
  geom_jitter( width = 0, shape = 21, alpha = 0.7) + # one way to show daily overlap
  scale_color_manual(name = "Analysts", 
                     values = c("Brigid" = "blue", "Carly" = "red", 
                                "Christian" = "orange", "Daniel" = "black",
                                "Jess" = "steelblue", "Luke" = "blue", 
                                "Mara" = "blue", "Marcus" = "blue",
                                "Nassim" = "blue", "Nathalie" = "blue", 
                                "Neil" = "blue"))


Answer 1: (score: 1)

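First, you are right that your code is "a bit cumbersome". To take advantage of ggplot, you should keep your data in tidy ("tall") format, where one variable identifies the person and another holds that person's score. This is easy to do with gather() from the tidyr package: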

riddle_log2 <- riddle_log %>%
  tidyr::gather("Analyst", "Wins", Brigid:Neil)
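
In tidyr 1.0.0 and later, gather() is superseded by pivot_longer(). The sketch below shows the equivalent reshaping on a small cut-down copy of the data (the first two rows and two of the analyst columns from the question); the object names `wide` and `long` are only illustrative.

```r
library(tidyr)

# Cut-down copy of riddle_log: two rows, two analysts (illustration only)
wide <- data.frame(
  date      = as.Date(c("2019-06-06", "2019-06-07")),
  title     = c("The Midget", "Bowling Balls"),
  Carly     = c(0, 1),
  Christian = c(1, 1)
)

# One row per (date, analyst) pair; column names go to 'Analyst', values to 'Wins'
long <- pivot_longer(wide, cols = Carly:Christian,
                     names_to = "Analyst", values_to = "Wins")
long
```

Note that pivot_longer() orders the result row by row, whereas gather() stacks one column after another; for plotting with ggplot this makes no difference.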

Now that the data is in the format ggplot prefers, we can plot it much more easily, like this:

ggplot(riddle_log2, aes(x = date, y = Wins, color = Analyst)) + 
  geom_line(size = 2)

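[Figure: ggplot with default colors and equal line widths]

However, many of the lines lie on top of one another. We can improve the plot by giving thicker lines to the people who are drawn first (and therefore end up behind the other lines), for example: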

ggplot(riddle_log2, aes(x = date, y = Wins, color = Analyst)) + 
  geom_line(aes(size = Analyst)) +
  scale_size_manual(values = seq(4, 1, length = 11))

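[Figure: ggplot with default colors and different line widths]

That is slightly better. Next, we can improve the colors. There are many color palettes available for R. In cases like this I often use the palettes of Paul Tol: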

tol_colors = c("#332288", "#6699CC", "#88CCEE", "#44AA99", "#117733", "#999933",   
               "#DDCC77", "#661100", "#CC6677", "#882255", "#AA4499")
ggplot(riddle_log2) + 
  geom_line(aes(x = date, y = Wins, color = Analyst, size = Analyst)) +
  scale_size_manual(values = seq(5, 1, length = 11)) +
  scale_color_manual(values = tol_colors)

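[Figure: ggplot with custom colors and line widths]

This is not perfect, but it is an improvement. Something you should consider is using facet_wrap() to split the plot into several subplots: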

gg <- ggplot(riddle_log2, aes(x = date, y = Wins, color = Analyst)) + 
  geom_line(size = 2) +
  scale_color_manual(values = tol_colors) + 
  facet_wrap(~Analyst) 
gg

[Figure: ggplot split up into one subplot per person]

I think this is the better option in this case.

Next, you also wanted the x-axis to show all the dates. There is too little room to label every day, so I will label every second day:

gg + 
  scale_x_datetime(breaks = "2 day", date_labels = "%d. %b") +
  theme(axis.text.x = element_text(hjust = 0, angle = -45))

[Figure: ggplot with improved date axis]

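As you can see, formatting the labels is not exactly simple, but it is flexible. The codes that control how times and dates are displayed are especially confusing. In this case, %d means "day of the month" and %b means "abbreviated month name". You can find the other codes by running ?strptime.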
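
The format codes can be tried out in base R before they go into a scale. The following is only a sketch using three date values taken from the question (a Friday, the same Friday again, and the following Monday); the variable names are illustrative.

```r
# Three date-times from the question's data: Fri 14 June (twice) and Mon 17 June 2019
dates <- as.POSIXct(c("2019-06-14", "2019-06-14", "2019-06-17"), tz = "UTC")

format(dates, "%d")      # day of the month as a two-digit string
format(dates, "%m")      # month as a two-digit number
format(dates, "%d. %b")  # day plus abbreviated month name (the name depends on the locale)

# The formatted strings can also be turned into an ordered factor:
# one level per day that actually occurs, in order of appearance
labels <- format(dates, "%d. %b")
day <- factor(labels, levels = unique(labels))
as.integer(day)
```

Because the weekend days never occur in the data, they never become factor levels. One possible way to address the weekend gaps mentioned in the question would therefore be to map such a factor to x instead of `date` (with `group = Analyst` inside `aes()`, so that geom_line() still connects the points on a discrete axis); scale_x_datetime() would then no longer apply.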

Finally, we will add the "title" of the day each time the "Wins" score increases. We first add a variable 'Wins_increase' that holds the increase in Wins:

riddle_log2 <- riddle_log2 %>%
  arrange(Analyst, date) %>%                # Make sure the rows are sorted correctly
  group_by(Analyst) %>%                     # 'Wins_increase' is calculated separately for each Analyst
  mutate(Wins_increase = Wins - lag(Wins))  # How much 'Wins' has increased since the previous row
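
To see what the grouped lag() does, here is a small sketch on made-up toy data (the `toy` data frame and the column names `inc_na` and `inc_zero` are hypothetical). It also shows one caveat: with plain lag(), the first row of each analyst is NA, so a win on the very first day is never flagged. Passing `default = 0` is one way around that, on the assumption that everyone starts from zero wins.

```r
library(dplyr)

# Toy scores (illustration only): Christian already has a win on day 1
toy <- data.frame(
  Analyst = rep(c("Christian", "Marcus"), each = 4),
  day     = rep(1:4, times = 2),
  Wins    = c(1, 1, 1, 2,   0, 1, 2, 2)
)

toy_inc <- toy %>%
  arrange(Analyst, day) %>%
  group_by(Analyst) %>%
  mutate(
    inc_na   = Wins - lag(Wins),               # first row per analyst is NA
    inc_zero = Wins - lag(Wins, default = 0)   # assumes a starting score of 0
  ) %>%
  ungroup()

toy_inc
filter(toy_inc, inc_na > 0)    # rows with NA are dropped by filter()
filter(toy_inc, inc_zero > 0)  # also keeps Christian's day-1 win
```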

Then we use geom_text() to add rotated labels:

gg + scale_x_datetime(breaks = "2 day", date_labels = "%d. %b") +  # as before
  theme(axis.text.x = element_text(hjust = 0, angle = -45)) +      # as before
  geom_text(data = riddle_log2 %>% filter(Wins_increase > 0),      # Pick only where "Wins" is increasing
            aes(y = Wins + 0.3, label = title),                    # We add 0.3 to lift the labels a bit
            hjust = 0, angle = 90, size = 2)                       # Left-adjust and rotate labels

[Figure: ggplot with labels added]

The next thing to fix is the overlap between Marcus's labels (he won twice on the same day). This can be fixed with the ggrepel package.
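
A minimal sketch of that idea, assuming ggrepel is installed: geom_text_repel() takes the same label aesthetic as geom_text() and nudges labels apart when they would collide. The data frame `labels_df` below is a hypothetical stand-in holding only the two riddles that share a date in the question.

```r
library(ggplot2)
library(ggrepel)

# The two riddles solved on the same day, which is where the labels collide
labels_df <- data.frame(
  date  = as.POSIXct(c("2019-06-17", "2019-06-17"), tz = "UTC"),
  Wins  = c(1, 2),
  title = c("Japanese Ship", "Haunted Floor")
)

p <- ggplot(labels_df, aes(x = date, y = Wins)) +
  geom_point() +
  geom_text_repel(aes(label = title), size = 2, seed = 1)  # seed makes the layout reproducible
p
```

In the faceted plot above, the equivalent change would be to swap geom_text() for geom_text_repel() while keeping the same `data` and `aes()` arguments; the rotation and justification settings may need adjusting, since the repelled positions differ from the fixed ones.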