I have the following subset of data (imported from a csv file):
treas_yields <- read.csv(file = 'treas.csv', header = TRUE, stringsAsFactors = FALSE)
DATES X2YR X3YR X5YR X7YR X10YR X30YR
1 6/3/2014 0.41 0.85 1.65 2.18 2.60 3.43
2 6/4/2014 0.41 0.85 1.65 2.20 2.61 3.45
3 6/5/2014 0.40 0.82 1.63 2.17 2.59 3.44
4 6/6/2014 0.41 0.86 1.66 2.19 2.60 3.44
5 6/9/2014 0.43 0.88 1.69 2.22 2.62 3.45
6 6/10/2014 0.45 0.93 1.71 2.24 2.64 3.47
When I call head() on the treas_yields data frame, I see:
X2YR X3YR X5YR X7YR X10YR X30YR
1 0.41 0.85 1.65 2.18 2.60 3.43
2 0.41 0.85 1.65 2.20 2.61 3.45
3 0.40 0.82 1.63 2.17 2.59 3.44
4 0.41 0.86 1.66 2.19 2.60 3.44
5 0.43 0.88 1.69 2.22 2.62 3.45
6 0.45 0.93 1.71 2.24 2.64 3.47
Now I add two columns to the right of this data frame (they are simply differences of existing columns, multiplied by 100). For this, I use:
treas_yields$X2s30s = (treas_yields$X30YR - treas_yields$X2YR) * 100
treas_yields$X2s10s = (treas_yields$X10YR - treas_yields$X2YR) * 100
Calling head(treas_yields) again, we see:
DATES X2YR X3YR X5YR X7YR X10YR X30YR X2s30s X2s10s
1 6/3/2014 0.41 0.85 1.65 2.18 2.60 3.43 302 219
2 6/4/2014 0.41 0.85 1.65 2.20 2.61 3.45 304 220
3 6/5/2014 0.40 0.82 1.63 2.17 2.59 3.44 304 219
4 6/6/2014 0.41 0.86 1.66 2.19 2.60 3.44 303 219
5 6/9/2014 0.43 0.88 1.69 2.22 2.62 3.45 302 219
6 6/10/2014 0.45 0.93 1.71 2.24 2.64 3.47 302 219
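For anyone who wants to reproduce this without treas.csv, here is a minimal sketch that rebuilds the six rows shown above by hand and recomputes both spreads. The name `sample_yields` is made up for illustration, and only the columns needed for the spreads are included. The `round()` call is my addition to hide floating-point noise. The "basis points" label assumes the yields are quoted in percent.

```r
# Hypothetical stand-in for the first six rows of treas.csv
# (values copied from the head() output above)
sample_yields <- data.frame(
  DATES = c("6/3/2014", "6/4/2014", "6/5/2014",
            "6/6/2014", "6/9/2014", "6/10/2014"),
  X2YR  = c(0.41, 0.41, 0.40, 0.41, 0.43, 0.45),
  X10YR = c(2.60, 2.61, 2.59, 2.60, 2.62, 2.64),
  X30YR = c(3.43, 3.45, 3.44, 3.44, 3.45, 3.47),
  stringsAsFactors = FALSE
)

# Spreads in basis points (assuming yields are in percent);
# round() removes floating-point noise from the subtraction
sample_yields$X2s30s <- round((sample_yields$X30YR - sample_yields$X2YR) * 100)
sample_yields$X2s10s <- round((sample_yields$X10YR - sample_yields$X2YR) * 100)

print(sample_yields)
```

The two new columns should match the X2s30s and X2s10s values in the output above.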
Now I want to plot the X2s30s and X2s10s columns as time series. For this, I use:
ggplot(treas_yields, aes(x = DATES)) +
geom_line(aes(y = X2s30s), color = 'red') +
geom_line(aes(y = X2s10s), color = 'blue')
However, I see the following error message:
geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?
geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?
When I call str(treas_yields), the yield and spread columns appear to be numeric:
'data.frame': 1251 obs. of 9 variables:
$ DATES : chr "6/3/2014" "6/4/2014" "6/5/2014" "6/6/2014" ...
$ X2YR : num 0.41 0.41 0.4 0.41 0.43 0.45 0.44 0.42 0.45 0.49 ...
$ X3YR : num 0.85 0.85 0.82 0.86 0.88 0.93 0.91 0.88 0.93 0.95 ...
$ X5YR : num 1.65 1.65 1.63 1.66 1.69 1.71 1.7 1.66 1.7 1.71 ...
$ X7YR : num 2.18 2.2 2.17 2.19 2.22 2.24 2.23 2.17 2.21 2.21 ...
$ X10YR : num 2.6 2.61 2.59 2.6 2.62 2.64 2.65 2.58 2.6 2.61 ...
$ X30YR : num 3.43 3.45 3.44 3.44 3.45 3.47 3.47 3.41 3.41 3.4 ...
$ X2s30s: num 302 304 304 303 302 302 303 299 296 291 ...
$ X2s10s: num 219 220 219 219 219 219 221 216 215 212 ...
Any idea what is causing these errors? Thanks!
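EDIT: The problem was caused by the dates not being in a proper date format. After importing the csv file, the DATES column is character (chr) and has to be converted to actual dates (as @aosmith suggested in the comments below).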
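To illustrate why the conversion matters, here is a minimal sketch using only base R (`as.Date` with an explicit format, as an alternative to lubridate's `mdy`). ggplot2 treats a character x variable as discrete and puts each distinct value in its own group, which is why geom_line finds only one observation per group. The vectors `d` and `d_date` are made-up names holding a few of the dates shown above.

```r
# A few date strings as they arrive from read.csv (character, not Date)
d <- c("6/3/2014", "6/9/2014", "6/10/2014")
class(d)

# As character, the values sort as text, so "6/10/2014" comes before "6/3/2014"
sort(d, method = "radix")

# Base R conversion with an explicit month/day/year format
d_date <- as.Date(d, format = "%m/%d/%Y")
class(d_date)

# As Date, the values sort chronologically
sort(d_date)
```

Once the column is a Date, ggplot2 uses a continuous date scale and connects the points in time order.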
The revised code is:
library(ggplot2)
library(lubridate)
# import data
treas_yields <- read.csv(file = 'treas.csv', header = TRUE, stringsAsFactors = FALSE)
# convert dates from character to Date (necessary for plotting later)
treas_yields$DATES = mdy(treas_yields$DATES)
# add spread columns (named X2s30s / X2s10s, since a bare name cannot start with a digit)
treas_yields$X2s30s = (treas_yields$X30YR - treas_yields$X2YR) * 100
treas_yields$X2s10s = (treas_yields$X10YR - treas_yields$X2YR) * 100
# plot newly added columns
ggplot(treas_yields, aes(x = DATES)) +
  geom_line(aes(y = X2s30s), color = 'red') +
  geom_line(aes(y = X2s10s), color = 'blue')