In JAGS

Time: 2017-12-29 22:28:38

Tags: r missing-data bayesian jags

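I am trying to write the simplest possible missing-data model in JAGS: one predictor variable (with some missing data points) and one outcome variable. I know this example is not the most useful or realistic, but it helps me sort out modeling problems before I move on to more complex missing-data scenarios.

The model and data are below, but here is the compile error: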

  Error in jags.model("MISSING_model.txt", data = dataList, inits = initsList, : 
    RUNTIME ERROR:
  Unable to resolve the following parameters:
  x[3] (line 5)
  x[4] (line 5)
  x[7] (line 5)
  x[13] (line 5)
  x[18] (line 5)
  x[20] (line 5)
  Either supply values for these nodes with the data
  or define them on the left hand side of a relation.

These are the missing data points. I define them below, so I am not sure where the error is.
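As a quick sanity check (a sketch added for illustration, not one of the original files), the node indices in the error message can be compared with the positions of the NA entries in the height column. The `height` vector is copied from the data used in this question; `missingPos` and `errorPos` are hypothetical names.

```r
# Heights copied from the data used in this question (NA = missing).
height <- c(64.0, 62.3,   NA,   NA, 64.8, 57.5,   NA, 70.2, 63.9, 71.1,
            66.5, 68.1,   NA, 75.1, 64.6, 69.2, 68.1,   NA, 63.2,   NA,
            64.1, 71.5, 76.0, 69.7, 73.3, 61.7, 66.4, 65.7, 68.3, 66.9)

# Positions of the missing heights.
missingPos <- which(is.na(height))
print(missingPos)

# Indices of the x[...] nodes listed in the JAGS error message.
errorPos <- c(3, 4, 7, 13, 18, 20)
print(identical(missingPos, as.integer(errorPos)))
```

The two sets coincide, so JAGS is complaining about exactly the missing x values. Assuming the model file matches the model listing in this post, comment lines included, "(line 5)" is the `zx[i]` assignment inside the `data` block. A likely reading is that the NA heights are read there as known values before any prior for them comes into play.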

Code:

# DEFINING THE DATA:
myData <-  matrix (
           c(64.0, 62.3,   NA,   NA, 64.8, 57.5,   NA, 70.2, 63.9, 71.1, 
             66.5, 68.1,   NA, 75.1, 64.6, 69.2, 68.1,   NA, 63.2,   NA, 
             64.1, 71.5, 76.0, 69.7, 73.3, 61.7, 66.4, 65.7, 68.3, 66.9,
            136.4,215.1,173.6,117.3,123.3, 96.5,178.3,191.1,158.0,193.9, 
            127.1,147.9,119.0,204.4,143.4,124.4,140.9,164.7,139.8,110.2, 
            134.1,193.6,180.0,155.0,188.2,187.4,139.2,147.9,178.6,111.1) ,
             nrow=30  )
colnames(myData) <- c("height","weight")
myData <- as.data.frame(myData)

Here I define a missing-data index:

# this index will help setup priors and let us look at posterior values for missing x's
  mIdx <- ifelse( is.na(myData$height) , 1 , 0)
  mIdx <- sapply( 1:length(mIdx), 
                  function(n) mIdx[n]*sum(mIdx[1:n]))
    # result: mIdx = 
      #              0, 0, 1, 2, 0, 0, 3, 0, 0, 0, 
      #              0, 0, 4, 0, 0, 0, 0, 5, 0, 6, 
      #              0, 0, 0, 0, 0, 0, 0, 0, 0, 0

# add missing index to myData
  myData$mIdx <- mIdx
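The same running index can be built without `sapply`. The following is a minimal sketch using `cumsum`; `isMissing` and `mIdxAlt` are hypothetical names, and the missing positions are hardcoded from the height column so the snippet runs on its own. With the real data frame, `isMissing` would simply be `is.na(myData$height)`.

```r
# TRUE where height is missing; positions taken from the height column.
isMissing <- seq_len(30) %in% c(3, 4, 7, 13, 18, 20)

# Running count of missing values, set to 0 where the value is observed.
mIdxAlt <- ifelse(isMissing, cumsum(isMissing), 0L)
print(mIdxAlt)
```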

Here is the data preparation:

# DATA PREP:
  y = myData[,"weight"]
  x = myData[,"height"]
  meanY = mean(y,na.rm=TRUE) 
  meanX = mean(x,na.rm=TRUE) 
  sdY = sd(y,na.rm=TRUE)     
  sdX = sd(x,na.rm=TRUE)
  mIdx = myData[,"mIdx"]
  Ntotal = length(y)
# Specify the data list for JAGS
    dataList = list(
    x = x ,
    y = y ,
    mIdx = mIdx , 
    meanY = meanY ,
    meanX = meanX ,
    sdY = sdY ,
    sdX = sdX ,
    Ntotal = Ntotal
)
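One common alternative to standardizing inside a JAGS `data` block is to standardize in R and hand the result to JAGS with its NAs left in place. The sketch below assumes that approach. `standardize`, `xFirst10`, and `zxFirst10` are hypothetical names, and the vector is just the first ten heights from the data.

```r
# Hypothetical helper: z-score a vector, ignoring NAs when computing mean and sd.
standardize <- function(v) {
  (v - mean(v, na.rm = TRUE)) / sd(v, na.rm = TRUE)
}

# First ten heights from the data (NA = missing).
xFirst10 <- c(64.0, 62.3, NA, NA, 64.8, 57.5, NA, 70.2, 63.9, 71.1)
zxFirst10 <- standardize(xFirst10)

# NAs stay NA, so they can be passed to JAGS as unobserved values.
print(which(is.na(zxFirst10)))
print(round(zxFirst10, 2))
```

With the full data, `zx = standardize(x)` and `zy = standardize(y)` could then take the place of `x` and `y` in the data list, provided the model refers to `zx` and `zy` directly.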

Here is the model:

# THE MODEL
# Standardize the data:
data {
  for ( i in 1:Ntotal ) {
      zx[i] <- ifelse ( mIdx[i]==0, ( x[i] - meanX ) / sdX , x[i] ) # skips NA's
      zy[i] <- ( y[i] - meanY ) / sdY
    }
}
# Specify the model for standardized data:
model {
  for ( i in 1:Ntotal ) {
    zy[i] ~ dt( zbeta0 + zbeta1 * zx[i] , 1/zsigma^2 , nu )
  }
# prior for imputing missing zx's
  zx ~ dnorm( 0 , 1 )
# Priors vague on standardized scale:
  zbeta0 ~ dnorm( 0 , 1/(10)^2 )  
  zbeta1 ~ dnorm( 0 , 1/(10)^2 )
  zsigma ~ dunif( 1.0E-3 , 1.0E+3 )
  nu ~ dexp(1/30.0)
# Transform back to original scale:
  beta1 <- zbeta1 * sdY / sdX
  beta0 <- zbeta0 * sdY  + meanY - zbeta1 * meanX * sdY / sdX
  sigma <- zsigma * sdY
  x <- zx*sdX + meanX
}
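For comparison, here is a sketch of one way such a model is often written. It is untested here and is not a confirmed fix. It assumes `zx` and `zy` are standardized in R and passed as data, with NA left in `zx` wherever height is missing, and it places the prior on each element `zx[i]` inside the loop. `xOrig`, `modelString`, `modelFile`, and the file name are hypothetical.

```r
# A sketch of one possible model, not a tested fix. It assumes zx and zy are
# standardized in R and passed as data, with NA left in zx where height is missing.
modelString <- "
model {
  for ( i in 1:Ntotal ) {
    # Observed zx[i] contributes a likelihood term; NA zx[i] is sampled.
    zx[i] ~ dnorm( 0 , 1 )
    zy[i] ~ dt( zbeta0 + zbeta1 * zx[i] , 1/zsigma^2 , nu )
    # Observed or imputed x on the original scale (xOrig is a hypothetical name).
    xOrig[i] <- zx[i] * sdX + meanX
  }
  zbeta0 ~ dnorm( 0 , 1/(10)^2 )
  zbeta1 ~ dnorm( 0 , 1/(10)^2 )
  zsigma ~ dunif( 1.0E-3 , 1.0E+3 )
  nu ~ dexp( 1/30.0 )
  # Transform back to the original scale using the data-list names.
  beta1 <- zbeta1 * sdY / sdX
  beta0 <- zbeta0 * sdY + meanY - zbeta1 * meanX * sdY / sdX
  sigma <- zsigma * sdY
}
"

# Write the model text to a temporary file that jags.model() could read.
modelFile <- file.path(tempdir(), "MISSING_model_sketch.txt")
writeLines(modelString, con = modelFile)
```

Because every `zx[i]` has its own stochastic relation, JAGS can treat the observed entries as data and sample the NA entries. Monitoring `xOrig` at the missing rows would then give the imputed heights on the original scale.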

Finally, the initial values for the MCMC chains:

# INITIALIZE VALUES
 # values hardcoded for simplicity
 beta0 = 0   ;  zbeta0 = 0
 beta1 = 3.6 ;  zbeta1 = 0.5
 sigma = 30  ;  zsigma = 1
 nu = 30
# initial values for missing x data:
  xInit = rep( NA , length(x) )
  xInit[3] <- 68 ; xInit[4]<- 64 ; xInit[7] <- 68
  xInit[13] <- 64 ; xInit[18] <- 68 ; xInit[20] <- 64
initsList = list( beta0=beta0 ,  beta1=beta1 , zbeta0=zbeta0 , zbeta1=zbeta1 , 
                zsigma=zsigma , sigma=sigma, nu = nu , x=xInit )
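JAGS only accepts initial values for stochastic nodes, so deterministic quantities such as `beta0`, `beta1`, and `sigma` would normally be left out of the inits list. The sketch below assumes the missing predictor values are sampled on the standardized scale as `zx`, which is an assumption and not what the model in this post does. `zxInit`, `missingPos`, and `initsSketch` are hypothetical names.

```r
Ntotal <- 30
missingPos <- c(3, 4, 7, 13, 18, 20)   # rows with missing height

# NA for observed rows; 0 (the mean on the standardized scale) for missing rows.
zxInit <- rep(NA_real_, Ntotal)
zxInit[missingPos] <- 0

# Only stochastic nodes appear in the list.
initsSketch <- list(zbeta0 = 0, zbeta1 = 0.5, zsigma = 1, nu = 30, zx = zxInit)
str(initsSketch)
```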

And the call to JAGS:

jagsModel = jags.model( "MISSING_model.txt" , data=dataList , inits=initsList , 
                      n.chains=nChains , n.adapt=adaptSteps )

The error seems to come from setting up the prior for the missing data incorrectly:

# imputed prior for missing zx's
  zx ~ dnorm( 0 , 1 )

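I have read that when an NA is present, JAGS automatically switches between treating the node as data and drawing it from its prior, but I am not sure where my code for the zx's goes wrong.

Thanks for any tips and help.

0 answers:

No answers yet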