Loess scatter plot time series error

Time: 2016-05-25 22:37:09

Tags: r plot time-series loess

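I am trying to create loess curves for subsets of a time series. All of the subsets seem to show a similar problem when loess is applied, so the issue is probably in my df, but I am not sure how to fix it.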

The data is available here: https://dl.dropbox.com/s/zy6b5mjcu7uteof/data_all_PAR_max.csv?dl=0

This code is part of a larger function, so some values that are normally passed in are defined here to help reproduce the error:

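library(data.table)  # setnames(); assumed, the post does not show which packages were loaded
library(reshape2)    # melt(); assumed, loaded last so reshape2::melt handles the data.frame
sumfile <- read.csv('https://dl.dropbox.com/s/zy6b5mjcu7uteof/data_all_PAR_max.csv')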
codename = "EXEM"
descriptor = "max"
radtype = "PAR"
totYrs = c(1997:2015)
ylbl = expression("PAR " ~ (mu ~ mol ~ photons ~ m^{-2} ~ s^{-1}))
clr = "blue"

group <- melt(sumfile,  id.vars = 'date', variable.name = 'series')
setnames(group, old = c('date','series','value'), new = c('Date','Year',radtype))
group$Date <- as.Date(group$Date)
# group <- na.omit(group) # uncommenting resolves error!
o <- order(group$Date)
lo <- loess(PAR ~ as.numeric(Date), span = 0.25, data=group)

plot(group$Date,group$PAR,pch=19,cex=0.1, col=clr,
     xlab ="Date",
     ylab = ylbl,
    main = paste('Loess curve for', descriptor, radtype, 'from', min(totYrs), 'to',
              max(totYrs), '\nmeasured at', codename, 'met',sep=' '))
lines(group$Date[o], lo$fitted[o], col='red', lwd=1)

Replacing lines with points gives a better view of the error:

points(group$Date[o], lo$fitted[o], col='red', lwd=1)

The plot should look something like this:

[image: loessErrorPlot]

The phantom points appear to be an artifact of the NAs in the dataset.
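A small synthetic sketch of how NAs could produce this effect. The data frame `df` and its values are made up for illustration, and the sketch assumes R's default `options(na.action = "na.omit")`. Under that default, loess() drops incomplete rows before fitting, so `lo$fitted` is shorter than the data frame and an index built from the full data no longer lines up with it.

```r
set.seed(1)
# Hypothetical stand-in for one series: 200 daily values with two gaps
df <- data.frame(Date = seq(as.Date("2013-01-01"), by = "day", length.out = 200))
df$PAR <- sin(seq_len(200) / 30) + rnorm(200, sd = 0.1)
df$PAR[c(10:30, 120:140)] <- NA          # 42 missing observations

lo <- loess(PAR ~ as.numeric(Date), span = 0.25, data = df)

n_rows   <- nrow(df)            # rows in the data frame, NA included
n_fitted <- length(lo$fitted)   # one fitted value per row loess actually used
c(rows = n_rows, fitted = n_fitted)

# Element i of lo$fitted belongs to the i-th non-NA row, not to row i of df,
# so pair the fitted values with the dates of the rows that were kept
used    <- which(!is.na(df$PAR))
aligned <- data.frame(Date = df$Date[used], fit = lo$fitted)
```

Plotting `aligned$fit` against `aligned$Date` keeps each fitted value on its own date, whereas `lo$fitted[order(df$Date)]` indexes past the end of the shorter vector and shifts the values that remain.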

I created a loop to check each year, which revealed more errors.

for (i in totYrs) {
  tryCatch({         
    yval <- paste(radtype, i, descriptor,sep='_')

    sumfile$date <- as.Date(sumfile$date)
    lo_ <- eval(parse(text = paste("loess(", yval, "~ as.numeric(date),
                                 span = 0.25, data=sumfile)")))
    oo <- order(sumfile$date)
    plot(sumfile$date, eval(parse(text = paste("sumfile$",yval))),
         pch=19,cex=0.1, col=clr,
         xlab ="Date",
         ylab = ylbl,
         main = paste('Loess curve for', descriptor, radtype, 'measured at\n',
                      codename, 'met during', i, '/', i+1, 'field season',sep=' '))
    lines(sumfile$date[oo], lo_$fitted[oo], col='red', lwd=1)  
  }, error=function(e){print("One or more years was not plotted because there was no data")})
}

The loop creates one plot per year and shows that the curve smoothing appears to work for some years but not for others.

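Setting loess(y ~ x, na.action=na.exclude) does not seem to have any effect on the final result. Adding group <- na.omit(group) to the melted df before calling loess() resolves the error for that data frame, but the problem still seems to persist when I look at individual years. Here is an example: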

sumfile$date <- as.Date(sumfile$date)
no_na <- na.omit(subset(sumfile, select=c(date,PAR_2013_max)))
lo13 <- loess(PAR_2013_max ~ as.numeric(date), span = 0.25, data=no_na)
oo <- order(sumfile$date)
plot(sumfile$date, sumfile$PAR_2013_max)
lines(sumfile$date[oo], lo13$fitted[oo], col='red', lwd=1)

Any help identifying a solution for plotting the annual curves would be greatly appreciated.

1 answer:

Answer 0: (score: 0)

I think you are right about where those "ghost" points come from. You should be able to omit any NA values; here is an example using ggplot2:

library(ggplot2)
plot <- ggplot( data = group, aes( x = Date, y = PAR ) ) +
    geom_point( na.rm = TRUE ) +
    geom_smooth( method = "loess", span = 0.2, na.rm = TRUE )
plot

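This gives me a nice curve. Note that the "span" argument controls the degree of smoothing in the loess function, so adjust it to taste, or leave it out to accept the default smoothing.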
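For anyone staying with base graphics as in the question, a rough equivalent might look like the sketch below. It runs on synthetic data (the `df` data frame and its values are stand-ins, not the poster's file) and keeps the fitted values aligned with the dates by dropping the NA rows and sorting before the fit, so the same data frame feeds both loess() and lines().

```r
set.seed(42)
# Hypothetical stand-in for one field season of daily PAR maxima
df <- data.frame(Date = seq(as.Date("2013-10-01"), by = "day", length.out = 150))
df$PAR <- 1000 + 300 * sin(seq_len(150) / 20) + rnorm(150, sd = 50)
df$PAR[sample(150, 30)] <- NA       # knock out 30 observations

cc <- df[!is.na(df$PAR), ]          # complete cases only
cc <- cc[order(cc$Date), ]          # sort once so lines() draws left to right

# Fit on the cleaned data; span is the smoothing knob discussed above
lo <- loess(PAR ~ as.numeric(Date), span = 0.25, data = cc)

plot(cc$Date, cc$PAR, pch = 19, cex = 0.3, col = "blue",
     xlab = "Date", ylab = "PAR")
lines(cc$Date, fitted(lo), col = "red", lwd = 1)
```

Because `cc` is the only data frame involved after cleaning, `fitted(lo)` has exactly one value per plotted date and no separate ordering index is needed.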

This also works year by year:

plot <- ggplot( data = group, aes( x = Date, y = PAR ) ) +
    geom_point( na.rm = TRUE ) +
    geom_smooth( method = "loess", span = 0.2, na.rm = TRUE )
plot
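One way the year-by-year version might be sketched is with a facet per series. This is an assumption about the approach, not necessarily what was originally run: the `group` data frame below is a synthetic stand-in with the same column names as the melted data in the question (Date, Year, PAR), and `facet_wrap()` fits a separate loess curve within each panel.

```r
library(ggplot2)

set.seed(7)
# Hypothetical stand-in for the melted data: two seasons of 120 days each
group <- expand.grid(day = 1:120, Year = c("PAR_2013_max", "PAR_2014_max"))
offset <- ifelse(group$Year == "PAR_2014_max", 365, 0)
group$Date <- as.Date("2013-10-01") + group$day - 1 + offset
group$PAR  <- 1000 + 300 * sin(group$day / 20) + rnorm(nrow(group), sd = 50)
group$PAR[sample(nrow(group), 40)] <- NA   # scatter some missing values

# Drop NA rows up front, then smooth each series in its own panel
clean <- group[!is.na(group$PAR), ]
p <- ggplot(clean, aes(x = Date, y = PAR)) +
  geom_point(size = 0.3) +
  geom_smooth(method = "loess", span = 0.25, se = FALSE) +
  facet_wrap(~ Year, scales = "free_x")
p
```

`scales = "free_x"` lets each panel show only its own date range, which matters when every series covers a different season.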