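I have plotted a multi-line time series with the tidyverse and lubridate packages, showing each year of the series as a separate line. Now I want to add one more line showing the average year, but R throws an error.
The original data are a daily time series with three columns: date, value, and day of the year.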
str(datos)
'data.frame': 13379 obs. of 3 variables:
$ fecha: Date, format: "1982-01-01" "1982-01-02" "1982-01-03" ...
$ SSTm : num 15.7 15.9 16.2 16.1 16 ...
$ day : num 1 2 3 4 5 6 7 8 9 10 ...
I then use this code to prepare the data for plotting:
library(tidyverse)
library(lubridate)

df <- as_tibble(datos) %>%
  rename_all(tolower) %>%
  mutate(fecha = ymd(fecha))

# Define the plot: all years with different colours
p <- df %>%
  mutate(
    year = factor(year(fecha)),     # use year to define separate curves
    date = update(fecha, year = 1)  # use a constant year for the x-axis
  ) %>%
  ggplot(aes(date, sstm, color = year)) +
  scale_x_date(date_breaks = "1 month", date_labels = "%m") + xlab(" ") +
  ylab("SST (ºC)") + theme_bw() + ggtitle("Mediterranean daily SST average (1982-2018)")
and draw the plot with p + geom_line().
Structure of p$data:
head(p$data)
# A tibble: 6 x 5
  fecha       sstm   day year  date
  <date>     <dbl> <dbl> <fct> <date>
1 1982-01-01  15.7     1 1982  1-01-01
2 1982-01-02  15.9     2 1982  1-01-02
3 1982-01-03  16.2     3 1982  1-01-03
4 1982-01-04  16.1     4 1982  1-01-04
5 1982-01-05  16.0     5 1982  1-01-05
6 1982-01-06  15.9     6 1982  1-01-06
> str(p$data)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 13379 obs. of 5 variables:
$ fecha: Date, format: "1982-01-01" "1982-01-02" "1982-01-03" ...
$ sstm : num 15.7 15.9 16.2 16.1 16 ...
$ day : num 1 2 3 4 5 6 7 8 9 10 ...
$ year : Factor w/ 37 levels "1982","1983",..: 1 1 1 1 1 1 1 1 1 1 ...
$ date : Date, format: "1-01-01" "1-01-02" "1-01-03" ...
Then the average year is computed from datos by grouping by day and taking the mean.
datos.daily.mean <- datos %>%
  group_by(day) %>%
  summarise(sstm = mean(SSTm))
datos.daily.mean$fecha <- as.Date(datos.daily.mean$day, origin = "1970-01-01")
Structure of datos.daily.mean:
head(datos.daily.mean)
# A tibble: 6 x 3
    day  sstm fecha
  <dbl> <dbl> <date>
1     1  16.2 1970-01-02
2     2  16.2 1970-01-03
3     3  16.2 1970-01-04
4     4  16.1 1970-01-05
5     5  16.1 1970-01-06
6     6  16.0 1970-01-07
> str(datos.daily.mean)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 366 obs. of 3 variables:
$ day : num 1 2 3 4 5 6 7 8 9 10 ...
$ sstm : num 16.2 16.2 16.2 16.1 16.1 ...
$ fecha: Date, format: "1970-01-02" "1970-01-03" "1970-01-04" ...
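Note that in the output above day 1 appears as 1970-01-02 rather than 1970-01-01, because as.Date() adds the numeric value to the origin. A minimal base-R sketch of that behaviour follows; subtracting 1 from the day number is only one possible adjustment and is not something the original code does:

```r
# as.Date() with a numeric input counts days forward from `origin`
as.Date(1, origin = "1970-01-01")      # "1970-01-02"
as.Date(0, origin = "1970-01-01")      # "1970-01-01"

# Assumed adjustment (not in the original code): shift by one day
# so that day-of-year 1 lands on 1 January
day <- c(1, 2, 3)
as.Date(day - 1, origin = "1970-01-01")
```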
datos.daily.mean can be plotted on its own with
ggplot() + geom_line(data = datos.daily.mean, aes(x = fecha, y = sstm)) + scale_x_date(date_breaks = "1 month", date_labels = "%m") + xlab(" ")
However, when I try to combine the two plots by adding a new geom_line for the average year, I get an error message about the date format:
p + geom_line() + geom_line(data=datos.daily.mean, aes(x=fecha, y= sstm),colour='blue')
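Error in charToDate(x) : character string is not in a standard unambiguous format
But I believe the dates in both datasets are in a standard format. Any ideas or help would be much appreciated. Thanks.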
Answer 0 (score: 0)
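I found a solution to the problem. My mistake was using fecha and date, two time variables that hold dates in different formats. I applied the same mutate step, which creates the date column, to datos.daily.mean: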
q <- datos.daily.mean %>%
  mutate(
    year = factor(year(fecha)),     # use year to define separate curves
    date = update(fecha, year = 1)  # use a constant year for the x-axis
  )
and
p + geom_line(aes(group = year), color = "black", alpha = 0.1) +
  geom_line(data = function(x) filter(x, year == 2018), size = 1.5) +
  geom_line(data = q, aes(x = date, y = sstm), colour = "black")
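The idea used here, putting every layer on the same constant-year date column, can be tried without the original data. The sketch below is only a rough illustration under stated assumptions: the SST values are simulated, the names sst_sim and mean_year are hypothetical, year 2000 is used as the constant year instead of 1 (it is a leap year, so 29 February has a place on the axis), and the mean is taken per date rather than per day.

```r
library(dplyr)
library(ggplot2)
library(lubridate)

# Simulated stand-in for the real data (assumption): three years of daily
# values with a seasonal cycle plus noise
set.seed(1)
sst_sim <- tibble(fecha = seq(ymd("2016-01-01"), ymd("2018-12-31"), by = "day")) %>%
  mutate(
    sstm = 19 - 5 * cos(2 * pi * (yday(fecha) - 45) / 365) + rnorm(n(), sd = 0.3),
    year = factor(year(fecha)),        # one curve per year
    date = update(fecha, year = 2000)  # shared x-axis; 2000 is an assumed choice
  )

# Mean year computed on the same `date` column that the yearly curves use
mean_year <- sst_sim %>%
  group_by(date) %>%
  summarise(sstm = mean(sstm))

# inherit.aes = FALSE keeps the mean layer from looking for `year` in mean_year
p_sim <- ggplot(sst_sim, aes(date, sstm, color = year)) +
  geom_line(alpha = 0.4) +
  geom_line(data = mean_year, aes(date, sstm),
            inherit.aes = FALSE, colour = "black") +
  scale_x_date(date_breaks = "1 month", date_labels = "%m") +
  xlab(" ") + ylab("SST (ºC)") + theme_bw()
```

Printing p_sim draws the yearly curves with the mean year on top. Because both data frames share the date column, the layers use a single date scale.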