Errors in the segmented package: breakpoint confusion

Asked: 2012-08-27 01:42:17

Tags: r linear-regression

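I am using the segmented package to fit a piecewise linear regression, and I get an error when I try to set my own breakpoints. It seems to happen only when I try to set more than two.

(EDIT) Here is the code I am using: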

# data
bullard <- structure(list(Rt = c(0, 4.0054, 25.1858, 27.9998, 35.7259, 39.0769, 
45.1805, 45.6717, 48.3419, 51.5661, 64.1578, 66.828, 111.1613, 
114.2518, 121.8681, 146.0591, 148.8134, 164.6219, 176.522, 177.9578, 
180.8773, 187.1846, 210.5131, 211.483, 230.2598, 262.3549, 266.2318, 
303.3181, 329.4067, 335.0262, 337.8323, 343.1142, 352.2322, 367.8386, 
380.09, 388.5412, 390.4162, 395.6409), Tem = c(15.248, 15.4523, 
16.0761, 16.2013, 16.5914, 16.8777, 17.3545, 17.3877, 17.5307, 
17.7079, 18.4177, 18.575, 19.8261, 19.9731, 20.4074, 21.2622, 
21.4117, 22.1776, 23.4835, 23.6738, 23.9973, 24.4976, 25.7585, 
26.0231, 28.5495, 30.8602, 31.3067, 37.3183, 39.2858, 39.4731, 
39.6756, 39.9271, 40.6634, 42.3641, 43.9158, 44.1891, 44.3563, 
44.5837)), .Names = c("Rt", "Tem"), class = "data.frame", row.names = c(NA, 
-38L))

library(segmented)

# create a linear model
out.lm <- lm(Tem ~ Rt, data=bullard)

o<-segmented(out.lm, seg.Z=~Rt, psi=list(Rt=c(200,300)), control=seg.control(display=FALSE))

Using the psi option, I have tried the following:

psi = list(x = c(150, 300)) -- OK
psi = list(x = c(100, 200)) -- OK
psi = list(x = c(200, 300)) -- OK
psi = list(x = c(100, 300)) -- OK
psi = list(x = c(120, 150, 300)) -- error 1 below
psi = list(x = c(120, 300)) -- OK
psi = list(x = c(120, 150)) -- OK
psi = list(x = c(150, 300)) -- OK
psi = list(x = c(100, 200, 300)) -- error 2 below

(1) Error in segmented.lm(out.lm, seg.Z = ~Rt, psi = list(Rt = c(120, 150, : only 1 datum in an interval: breakpoint(s) at the boundary or too close

(2) Error in diag(Cov[id, id]) : subscript out of bounds

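I have listed my data at this question, but as a guide, the x data range from about 0 to 400.

A second, related question: how do I actually fix the breakpoints when using the segmented package?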

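2 answers:

Answer 0: (score: 6)

The issue here seems to be poor error trapping in the segmented package. Looking at the code of segmented.lm allows some debugging. For example, in the case of psi = list(x = c(100, 200, 300)), an augmented linear model is fitted, as shown below: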

lm(formula = Tem ~ Rt + U1.Rt + U2.Rt + U3.Rt + psi1.Rt + psi2.Rt + 
    psi3.Rt, data = mf)

Call:
lm(formula = Tem ~ Rt + U1.Rt + U2.Rt + U3.Rt + psi1.Rt + psi2.Rt + 
    psi3.Rt, data = mf)

Coefficients:
(Intercept)           Rt        U1.Rt        U2.Rt        U3.Rt      psi1.Rt
   15.34303      0.04149      0.04591    742.74186   -742.74499      1.02252
    psi2.Rt      psi3.Rt
         NA           NA

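As you can see, the fit contains NA coefficients, which then lead to a degenerate variance-covariance matrix (called Cov in the code). The function does not check for this: it tries to extract the diagonal entries of Cov and fails with the error message above. The first error, although perhaps not very helpful, is at least caught by the function itself and indicates that the breakpoints are too close together.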
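
To make that failure mode concrete, here is a minimal base-R sketch of the same kind of breakdown. It does not use segmented at all, and I am only assuming that Cov is built in a comparable way; the data frame d and the redundant column z1 are made up purely for illustration. A perfectly collinear term gets an NA coefficient, the unscaled covariance matrix from summary() then has no row or column for it, and indexing by that name fails.

```r
# Illustration only: NOT the code inside segmented.lm.
# d, z1 and id are hypothetical names invented for this sketch.
set.seed(1)
d <- data.frame(x = 1:20)
d$y  <- 2 + 0.5 * d$x + rnorm(20)
d$z1 <- 2 * d$x                     # perfectly collinear with x

fit <- lm(y ~ x + z1, data = d)
print(coef(fit))                    # z1 cannot be estimated and is reported as NA

# summary.lm() keeps only the estimable coefficients in cov.unscaled,
# so there is no "z1" row or column in this matrix.
Cov <- summary(fit)$cov.unscaled
print(dimnames(Cov))

# Asking for the diagonal entries by name, including the dropped term, fails.
id <- c("x", "z1")
msg <- tryCatch(
  {
    diag(Cov[id, id])
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

A check such as anyNA(coef(fit)) before touching the covariance matrix would be enough to turn this into a readable error.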

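In the absence of better error trapping in the function, I think all you can do is adopt a trial-and-error approach (and avoid breakpoints that are too close together). For example, psi = list(x = c(50, 200, 300)) seems to work OK.

Answer 1: (score: 3)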
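
The trial-and-error approach can be semi-automated. The following is only a sketch: try_starts is a hypothetical helper, not part of segmented, that runs a fitting function over several candidate sets of starting values and records which ones fail and why. The usage part assumes that out.lm and bullard from the question are in the workspace and that segmented is installed; otherwise it is skipped.

```r
# Hypothetical helper (not part of the segmented package): call fit_fun()
# on every candidate vector of starting breakpoints and record the outcome.
try_starts <- function(fit_fun, candidates) {
  lapply(candidates, function(psi) {
    tryCatch(
      list(psi = psi, ok = TRUE, fit = fit_fun(psi), error = NA_character_),
      error = function(e) {
        list(psi = psi, ok = FALSE, fit = NULL, error = conditionMessage(e))
      }
    )
  })
}

# Usage with the model from the question; runs only if out.lm exists and
# the segmented package is available.
if (requireNamespace("segmented", quietly = TRUE) && exists("out.lm")) {
  starts <- list(c(100, 200, 300), c(50, 200, 300), c(120, 150, 300))
  res <- try_starts(
    function(psi) segmented::segmented(out.lm, seg.Z = ~Rt, psi = list(Rt = psi)),
    starts
  )
  for (r in res) {
    cat(paste(r$psi, collapse = ", "), ":", if (r$ok) "OK" else r$error, "\n")
  }
}
```

Each element of the result keeps the fitted object when the fit succeeded, so a working set of starting values can be picked without refitting.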

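@jaySf If you combine while with tryCatch, you can make the command repeat until the model is fitted without an error. I guess the failures depend on the randomisation settings inside the function, which can be seen in seg.control.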

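library(segmented)

# x is a placeholder data frame with columns xdat (response) and ydat (predictor)
lm.model <- lm(xdat ~ ydat, data = x)
fitted.ok <- FALSE
attempts <- 0
# retry until segmented() succeeds, but give up after 50 attempts
while (!fitted.ok && attempts < 50) {
  attempts <- attempts + 1
  tryCatch({
    s <- segmented(lm.model, seg.Z = ~ydat, psi = NA)
    fitted.ok <- TRUE
  }, error = function(e) {
    # ignore the error and try again
  })
}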