For example, if we have two time series a and b:
time <- seq(as.Date("1999-06-15"), as.Date("2008-06-15"), by = "years")
a <- c(22.3, 24.1, 35, 35, 35.9, 39.2, 34.8, 31.5, 29.1, 25.8)
b <- c(22, 24.9, 31, 34, 37.5, 36.3, 32.1, 29.7, 28.6, 23.9)
plot(as.Date(time), a, type = "l", xlab = "Date", ylab = "T(°C)")
lines(as.Date(time), b, col = 2)
Is there a way to make my plot look like the example image?
Answer 0: (score: 3)
You can use geom_line and geom_col from ggplot2.
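library(tidyverse)
# Differences between the two series, kept in a separate wide data frame
DF_bar <- mutate(DF, diff_a_b = a - b)
DF %>%
  gather(key, value, a, b) %>%
  ggplot(., aes(time)) +
  geom_line(aes(y = value, col = key)) +
  geom_col(data = DF_bar, aes(y = diff_a_b)) # or geom_bar(data = DF_bar, aes(y = diff_a_b), stat = "identity")
As a first step, I created a new dataset, DF_bar, that contains the variable diff_a_b, which is the difference between a and b.
Next, I reshaped your data from wide to long so that we can map the column key to the colour aesthetic in geom_line. Finally, I used DF_bar in geom_col to plot diff_a_b.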
Data
DF holds the columns time, a, and b from the question (for example, DF <- data.frame(time, a, b)).
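Answer 1: (score: 1)
Unfortunately, the first answer by markus (before the edit) contained a severe flaw that caused the bars showing the residuals to be twice as tall as expected. This becomes immediately visible when the fill of the bars is coloured by key: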
library(dplyr)
library(tidyr)
library(ggplot2)
# Reproduces the flaw: diff_a_b is computed before reshaping, so it is repeated for both keys
tibble(time, a, b) %>%
  mutate(diff_a_b = a - b) %>%
  gather(key, value, a, b) %>%
  ggplot(., aes(time)) +
  geom_line(aes(y = value, color = key)) +
  geom_col(aes(y = diff_a_b, fill = key))
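The root cause is that diff_a_b is not treated as a measure variable when reshaping from wide to long format. It is carried along as an identifier column instead: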
tibble(time, a, b) %>%
  mutate(diff_a_b = a - b) %>%
  gather(key, value, a, b)
Therefore, the diff_a_b value for each time appears twice:
# A tibble: 20 x 4
   time       diff_a_b key   value
   <date>        <dbl> <chr> <dbl>
 1 1999-06-15    0.3   a      22.3
 2 2000-06-15   -0.800 a      24.1
 3 2001-06-15    4     a      35
 4 2002-06-15    1     a      35
 5 2003-06-15   -1.6   a      35.9
 6 2004-06-15    2.9   a      39.2
 7 2005-06-15    2.70  a      34.8
 8 2006-06-15    1.8   a      31.5
 9 2007-06-15    0.5   a      29.1
10 2008-06-15    1.9   a      25.8
11 1999-06-15    0.3   b      22
12 2000-06-15   -0.800 b      24.9
13 2001-06-15    4     b      31
14 2002-06-15    1     b      34
15 2003-06-15   -1.6   b      37.5
16 2004-06-15    2.9   b      36.3
17 2005-06-15    2.70  b      32.1
18 2006-06-15    1.8   b      29.7
19 2007-06-15    0.5   b      28.6
20 2008-06-15    1.9   b      23.9
Because geom_col() defaults to position = "stack", the two values for each time are stacked on top of each other.
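The doubling can also be checked numerically, without drawing anything. The following is a minimal sketch in base R and is not part of either answer: it rebuilds the long data by hand (the names wide_df, long_df, and stacked, and the use of rbind() and aggregate(), are my own choices for illustration) and sums diff_a_b per time, which is what stacking does to the bar heights.

```r
# Data from the question
time <- seq(as.Date("1999-06-15"), as.Date("2008-06-15"), by = "years")
a <- c(22.3, 24.1, 35, 35, 35.9, 39.2, 34.8, 31.5, 29.1, 25.8)
b <- c(22, 24.9, 31, 34, 37.5, 36.3, 32.1, 29.7, 28.6, 23.9)

wide_df <- data.frame(time, a, b, diff_a_b = a - b)

# Hand-built long format in which diff_a_b is carried along as an id column,
# i.e. the same shape as the gather(key, value, a, b) result shown above
long_df <- rbind(
  data.frame(time = wide_df$time, diff_a_b = wide_df$diff_a_b,
             key = "a", value = wide_df$a),
  data.frame(time = wide_df$time, diff_a_b = wide_df$diff_a_b,
             key = "b", value = wide_df$b)
)

# Stacking adds up all bar segments that share an x position
stacked <- aggregate(diff_a_b ~ time, data = long_df, FUN = sum)

# Every stacked bar is exactly twice the intended difference
isTRUE(all.equal(stacked$diff_a_b, 2 * (a - b)))  # TRUE
```

Each time value contributes two identical rows, so the stacked height is 2 * (a - b) rather than a - b.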
If the position is changed to "dodge", markus' answer shows the expected result:
tibble(time, a, b) %>%
  mutate(diff_a_b = a - b) %>%
  gather(key, value, a, b) %>%
  ggplot(., aes(time)) +
  geom_line(aes(y = value, color = key)) +
  geom_col(aes(y = diff_a_b), position = "dodge")
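Another workaround is to use geom_linerange(). Each line segment is still drawn twice, but both copies land on exactly the same spot, so the duplication is not visible: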
tibble(time, a, b) %>%
  mutate(diff_a_b = a - b) %>%
  gather(key, value, a, b) %>%
  ggplot(., aes(time)) +
  geom_line(aes(y = value, color = key)) +
  geom_linerange(aes(ymin = 0, ymax = diff_a_b), size = 3)
IMHO, the correct ("tidy") approach is to treat diff_a_b as a third variable/time series when reshaping, and to use the data argument when creating the geoms:
tibble(time, a, b) %>%
  mutate(diff_a_b = a - b) %>%
  gather(key, value, -time) %>%  # diff_a_b becomes a third series in the long data
  ggplot(aes(x = time, y = value)) +
  geom_line(aes(col = key), data = function(x) filter(x, key != "diff_a_b")) +
  geom_col(data = function(x) filter(x, key == "diff_a_b"))
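As a side note, tidyr has since superseded gather() in favour of pivot_longer(). The sketch below shows how the same tidy approach might look with it, assuming tidyr 1.0.0 or later is installed. The object names long and p are only illustrative and do not appear in the original answers.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Data from the question
time <- seq(as.Date("1999-06-15"), as.Date("2008-06-15"), by = "years")
a <- c(22.3, 24.1, 35, 35, 35.9, 39.2, 34.8, 31.5, 29.1, 25.8)
b <- c(22, 24.9, 31, 34, 37.5, 36.3, 32.1, 29.7, 28.6, 23.9)

# Reshape a, b, and diff_a_b together, so each time has one row per series
long <- tibble(time, a, b) %>%
  mutate(diff_a_b = a - b) %>%
  pivot_longer(-time, names_to = "key", values_to = "value")

# Each layer filters the rows it needs from the shared long data
p <- ggplot(long, aes(x = time, y = value)) +
  geom_line(aes(colour = key), data = function(x) filter(x, key != "diff_a_b")) +
  geom_col(data = function(x) filter(x, key == "diff_a_b"))
p
```

Because diff_a_b is reshaped along with a and b, it occurs exactly once per time value, so the bars keep their intended height without changing the position argument.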
data.table and ggplot2
For those who prefer data.table for data manipulation:
library(data.table)
library(ggplot2)
long <- data.table(time, a, b)[
  , diff_a_b := a - b][
  , melt(.SD, "time")]
ggplot() + aes(time, value) +
  geom_line(aes(color = variable), data = long[variable != "diff_a_b"]) +
  geom_col(data = long[variable == "diff_a_b"])