ggplot: hide blank regions that have no x values

Asked: 2018-11-18 21:56:27

Tags: r ggplot2 time-series

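Is there an efficient way to hide blank regions in a time-series plot with ggplot2? I have the plot below and, as you can see, there is no data from December 3 to December 5. Is there a way to hide that part of the plot?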

[Image: the plot in question, with no data from December 3 to December 5]

I am currently using the following code to generate the plot:

ggplot(data = do.call(rbind.data.frame, combinedOutput[,2])) +
  geom_line(aes(x = Date, y = Return)) +
  geom_line(aes(x = Date, y = PredReturn), colour = "red") +
  facet_wrap(~Ticker, ncol = 2, scales = "free") +
  theme_light() + 
  theme(panel.spacing.y = unit(0.3, "cm"), 
        strip.background = element_rect(fill = "white"), 
        strip.text = element_text(colour = "black")) + 
  labs(x = NULL, y = "Daily Return in \\%")

This is what the raw data looks like. There are no NA values between 2016-12-02 16:00:00 and 2016-12-05 09:30:00.

[Image: screenshot of the raw data]

Thanks a lot!

1 Answer:

Answer 0: (score: 0)

I would treat this as a data wrangling problem first and a ggplot problem second.

Since the question includes no sample data, let's simulate some:

library(dplyr)
library(ggplot2)

set.seed(12345)
data <- data.frame(
  Date = seq.POSIXt(from = ISOdate(2018, 1, 1),
                    to = ISOdate(2018, 5, 1),
                    by = "hour")
) %>%
  mutate(Return = rnorm(n = n()),
         PredReturn = rnorm(n = n()))
data$Date[c(220:350,
            593:820,
            2100:2500)] <- NA
data <- na.omit(data)

# This creates a dataset with 3 distinct gaps in its time range
ggplot(data,
       aes(x = Date, group = 1)) +
  geom_line(aes(y = Return)) +
  geom_line(aes(y = PredReturn), color = "red") +
  theme_light()

[Image: plot with time gaps]

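We can identify the gaps by comparing the time differences between consecutive timestamps. Here I define a gap as any time difference larger than the median of all time differences. Depending on your data, you may want a different threshold (e.g. 2 days or 1 week).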

data2 <- data %>%
  arrange(Date) %>%
  mutate(date.diff = c(NA, diff(Date))) %>%
  mutate(is.gap = !is.na(date.diff) & date.diff > median(date.diff, na.rm = TRUE)) %>%
  mutate(period.id = cumsum(is.gap))

> head(data2)
                 Date     Return PredReturn date.diff is.gap period.id
1 2018-01-01 12:00:00  0.5855288 -0.7943254        NA  FALSE         0
2 2018-01-01 13:00:00  0.7094660  1.8875074         1  FALSE         0
3 2018-01-01 14:00:00 -0.1093033  0.5881879         1  FALSE         0
4 2018-01-01 15:00:00 -0.4534972  1.1556793         1  FALSE         0
5 2018-01-01 16:00:00  0.6058875 -0.8743878         1  FALSE         0
6 2018-01-01 17:00:00 -1.8179560  0.2586568         1  FALSE         0
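
If a fixed cutoff suits your data better than the median, the same steps can be written with an explicit unit so the comparison does not depend on the unit that diff() picks automatically. This is only a sketch: the 48-hour cutoff and the names gap_threshold_hours, demo and demo_periods are assumptions for illustration, and the demo data just mimics the hourly series simulated above.

```r
library(dplyr)

# Hypothetical cutoff: anything longer than 2 days counts as a gap.
gap_threshold_hours <- 48

# Hourly timestamps with the same three stretches removed as in the
# simulation above (illustrative data only).
demo <- data.frame(
  Date = seq.POSIXt(from = ISOdate(2018, 1, 1),
                    to = ISOdate(2018, 5, 1),
                    by = "hour")
)
demo <- demo[-c(220:350, 593:820, 2100:2500), , drop = FALSE]

demo_periods <- demo %>%
  arrange(Date) %>%
  mutate(
    # time since the previous row, always expressed in hours
    date.diff = as.numeric(difftime(Date, dplyr::lag(Date), units = "hours")),
    # a row starts a new period when it follows a long enough pause
    is.gap = !is.na(date.diff) & date.diff > gap_threshold_hours,
    period.id = cumsum(is.gap)
  )

# number of rows in each period
table(demo_periods$period.id)
```

With a fixed cutoff, short pauses that are normal for the data (nights or weekends, say) can be kept inside one period by choosing a threshold above them.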

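Each period.id value now corresponds to a subset of the data with no major time gap between its rows. We can wrangle the data further by converting it to long format: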

data2 <- data2 %>%
  select(-date.diff, -is.gap) %>% # drop unneeded columns
  tidyr::gather(color, y, -Date, -period.id) %>%
  mutate(color = factor(color,
                        levels = c("Return", "PredReturn")))

> head(data2)
                 Date period.id  color          y
1 2018-01-01 12:00:00         0 Return  0.5855288
2 2018-01-01 13:00:00         0 Return  0.7094660
3 2018-01-01 14:00:00         0 Return -0.1093033
4 2018-01-01 15:00:00         0 Return -0.4534972
5 2018-01-01 16:00:00         0 Return  0.6058875
6 2018-01-01 17:00:00         0 Return -1.8179560
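
On newer versions of tidyr (1.0.0 and later), pivot_longer() is the recommended replacement for gather(). A rough equivalent of the reshaping step might look like the sketch below; the small wide data frame is made up for illustration and only mirrors the columns used above.

```r
library(dplyr)
library(tidyr)

# Made-up stand-in for the wide data after the helper columns are dropped.
wide <- data.frame(
  Date = as.POSIXct("2018-01-01 12:00:00", tz = "GMT") + 3600 * 0:2,
  Return = c(0.1, 0.2, 0.3),
  PredReturn = c(0.4, 0.5, 0.6),
  period.id = 0L
)

long <- wide %>%
  pivot_longer(cols = c(Return, PredReturn),
               names_to = "color",
               values_to = "y") %>%
  mutate(color = factor(color,
                        levels = c("Return", "PredReturn")))

long
```

The row order differs from gather() (the two series are interleaved per timestamp rather than stacked), which does not matter for plotting.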

Pass this data to ggplot(), facet by period.id with a free x scale for each facet, and the white space from the earlier plot disappears:

p <- ggplot(data2,
       aes(x = Date, y = y, color = color)) +
  geom_line() +
  facet_grid(~ period.id, scales = "free_x", space = "free_x") +
  scale_color_manual(values = c("Return" = "black",
                                "PredReturn" = "red")) +
  theme_light()

p

[Image: faceted plot]

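Further tweaks to the plot's appearance can hide the gaps completely, though I'd caution against going to that extreme without making the time gaps clear to your intended audience, as the plot could easily be misinterpreted: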

p +
  scale_x_datetime(expand = c(0, 0),             # remove space within each panel
                   date_breaks = "5 days") +     # specify desired time breaks
  theme(panel.spacing = unit(0, "pt"),           # remove space between panels
        axis.text.x = element_text(angle = 90))  # rotate x-axis text

[Image: faceted plot without gaps]
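
The plot in the question is also faceted by Ticker, which the single-series example above does not cover. One possible way to combine the two is to compute period.id within each ticker and put Ticker on the rows of facet_grid(). The sketch below assumes that all tickers share the same timestamps, so that their period ids line up; the tickers "AAA" and "BBB" and the data are invented for illustration.

```r
library(dplyr)
library(ggplot2)

# Invented two-ticker data: six hourly rows, a three-day pause, six more rows.
dates <- as.POSIXct("2016-12-01 09:00:00", tz = "GMT") + 3600 * c(0:5, 72:77)
set.seed(1)
multi <- data.frame(
  Date = rep(dates, times = 2),
  Ticker = rep(c("AAA", "BBB"), each = length(dates)),
  Return = rnorm(2 * length(dates))
)

multi_periods <- multi %>%
  group_by(Ticker) %>%
  arrange(Date, .by_group = TRUE) %>%
  mutate(
    date.diff = as.numeric(difftime(Date, dplyr::lag(Date), units = "hours")),
    # same median rule as above, applied separately to each ticker
    period.id = cumsum(!is.na(date.diff) &
                         date.diff > median(date.diff, na.rm = TRUE))
  ) %>%
  ungroup()

p_multi <- ggplot(multi_periods, aes(x = Date, y = Return)) +
  geom_line() +
  facet_grid(Ticker ~ period.id, scales = "free", space = "free_x") +
  theme_light()

p_multi
```

If the tickers have gaps at different times, the period ids will not line up across rows, and it may be simpler to build one plot per ticker instead.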