Plotting cumulative data for multiple years in R

Time: 2020-06-25 10:38:49

Tags: r ggplot2

I have daily rainfall data, which I have converted to a cumulative sum within each year using the following code:

library(seas)
library(data.table)
library(ggplot2)

#Loading data
data(mscdata)
dat <- (mksub(mscdata, id=1108447))
dat$julian.date <- as.numeric(format(dat$date, "%j"))
DT <- data.table(dat)
DT[, Cum.Sum := cumsum(rain), by=list(year)]

df <- cbind.data.frame(day=dat$julian.date,cumulative=DT$Cum.Sum)

But when I try to plot it, I get strange output:

#Plotting using base R
df <- df[order(df[,1]),]
plot(df$day, df$cumulative, type="l", xlab="Day", ylab="Cumulative rainfall")


The same happens when I use ggplot2:

#Plotting using ggplot2
ggplot(df, aes(x = day, y = cumulative)) + geom_line()


But I would like one line per year, perhaps in grey, with the mean across all years drawn in red.

How can I achieve this?

2 answers:

Answer 0: (score: 1)

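As you can see, you are missing the group aesthetic in geom_line. Without group, ggplot treats all rows as a single series and connects every point in order of x, so values from different years that share the same day end up joined into one zigzag line. Here is an example that adds "year" as the group and calculates the mean for each day.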

library(seas)
library(data.table)
library(ggplot2)
library(reshape2)

data(mscdata)
dat <- (mksub(mscdata, id=1108447))
dat$julian.date <- as.numeric(format(dat$date, "%j"))
DT <- data.table(dat)
DT[, Cum.Sum := cumsum(rain), by=list(year)]

dt <- cbind.data.frame(day=dat$julian.date,cumulative=DT$Cum.Sum,year=DT$year)
TB <- melt(dt, id.vars = c('day','year'))
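# Reshape to one row per year and one column per day, then average each day's column across years
Mean_l <- colMeans(reshape(TB[c("day", "year", "value")], timevar = "day", idvar = "year", direction = "wide"), na.rm = TRUE)
Mean_l <- Mean_l[-1]  # drop the first element, which is the mean of the 'year' column
Mean_l <- data.frame(day = seq_along(Mean_l), Mean_l)

# Attach the per-day mean to every row so it can be drawn as a second line
TB_f <- data.frame(TB, avr = Mean_l$Mean_l[match(TB$day, Mean_l$day)])

ggplot(TB_f, aes(day, value)) +
  geom_line(aes(group = year)) +
  geom_line(aes(y = avr), color = 'red') +
  theme_light()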
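The per-day mean described above can also be computed in a single grouped step with data.table instead of reshaping to wide format. The following is only a minimal sketch under stated assumptions: the `toy` table and the names `cum_rain`, `daily_mean` and `mean_cum` are made up for illustration, and the rainfall values are random numbers rather than the `mscdata` station data.

```r
library(data.table)
library(ggplot2)

# Synthetic stand-in for the rainfall data: 3 years x 365 days (values are made up)
set.seed(1)
toy <- data.table(
  year = rep(2001:2003, each = 365),
  day  = rep(1:365, times = 3),
  rain = rexp(3 * 365, rate = 0.5)
)

# Cumulative sum within each year, as in the question
toy[, cum_rain := cumsum(rain), by = year]

# Mean of the yearly cumulative curves for each day of the year
daily_mean <- toy[, .(mean_cum = mean(cum_rain, na.rm = TRUE)), by = day]

# One grey line per year (group = year), plus the mean as a separate red layer
p <- ggplot(toy, aes(x = day, y = cum_rain)) +
  geom_line(aes(group = year), colour = "grey") +
  geom_line(data = daily_mean, aes(y = mean_cum), colour = "red") +
  theme_light()
print(p)
```

Passing the summary table through the `data` argument of the second `geom_line` keeps the mean out of the per-year data, so no `match` step is needed.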


Answer 1: (score: 1)

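Add the group aesthetic to tell ggplot to group by year, and add stat_summary to draw the red line (with the grouping removed, so the mean is taken across all years for each day).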

library(ggplot2)

ggplot(DT, aes(x = julian.date, y = Cum.Sum, group=year)) + 
  geom_line(col="grey") +
  labs(x="Date", y="Cumulative sum") +
  stat_summary(aes(group=NULL), fun="mean", geom="line", col="red", lwd=1)


With base graphics it is a bit more involved:

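# The df from the question has no year column, so rebuild it with one
df <- data.frame(day=DT$julian.date, cumulative=DT$Cum.Sum, year=DT$year)

par(mar=c(4,3.5,1,1))
# Set up an empty plot with the right axis ranges
plot(df$day, df$cumulative, type="n", xlab="Day", ylab="Cumulative rainfall", las=1)
grid()

# One thin grey line per year; invisible() suppresses the printed list of NULLs
invisible(lapply(split(df, df$year), FUN=function(x) 
     with(x, lines(day, cumulative, col="grey", lwd=0.5))))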

with(aggregate(cumulative~day, FUN=mean, data=df), 
    lines(x=day, y=cumulative, lwd=2, col="red"))
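A more compact base-graphics variant is to reshape the data into a day-by-year matrix and let `matplot` draw one line per column. This is a sketch of a different technique from the `split`/`lapply` approach above, not the answerer's method; the `toy` data frame and the names `wide` and `days` are assumptions for illustration, filled with random values.

```r
# Synthetic stand-in for the rainfall data (values are made up)
set.seed(1)
toy <- data.frame(
  year = rep(2001:2003, each = 365),
  day  = rep(1:365, times = 3),
  rain = rexp(3 * 365, rate = 0.5)
)

# Cumulative sum within each year
toy$cumulative <- ave(toy$rain, toy$year, FUN = cumsum)

# Matrix with one row per day and one column per year
# (each cell holds a single value, so mean() just returns it)
wide <- tapply(toy$cumulative, list(toy$day, toy$year), mean)
days <- as.numeric(rownames(wide))

# Grey line for every year, then the across-year mean in red
matplot(days, wide, type = "l", lty = 1, col = "grey",
        xlab = "Day", ylab = "Cumulative rainfall", las = 1)
lines(days, rowMeans(wide, na.rm = TRUE), col = "red", lwd = 2)
```

Years that lack a given day (for example day 366 in non-leap years) would appear as `NA` cells in `wide`, which `rowMeans(..., na.rm = TRUE)` skips.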