Converting an R data frame to a mice mids object: the data changes

Time: 2017-03-31 23:22:17

Tags: r r-mice

When I convert an R data frame to a mice mids object and then convert that mids object back to a new R data frame, I get a different dataset:

library(mice)
library(psych)  # in newer releases the epi dataset may live in psychTools instead

# Built-in dataset used is epi, from the psych package
HSIY2014 <- epi
summary(HSIY2014)

#Impute using mice
set.seed(20170327) 
predictormatrix <- quickpred(HSIY2014,exclude="id",method="spearman",mincor=0.1,minpuc=0.5)
HSIY2014_imp <- mice(HSIY2014,m=5,predictorMatrix=predictormatrix) 
HSIY2014_ImputedData<-complete(HSIY2014_imp,action="long",include=TRUE)

# Define this function, which works around the bug in mice::as.mids()
# Source: https://stats.stackexchange.com/questions/73562/analyzing-multiply-imupted-data-from-amelia-in-r-why-do-results-from-zelig-and
as.mids2 <- function(Data2, .imp=1, .id=2){
  ini <- mice(Data2[Data2[, .imp] == 0, -c(.imp, .id)],
              m = max(as.numeric(Data2[, .imp])), maxit = 0)
  names  <- names(ini$imp)
  if (!is.null(.id)){
     rownames(ini$data) <- Data2[Data2[, .imp] == 0, .id]
  }
  for (i in 1:length(names)){
    for(m in 1:(max(as.numeric(Data2[, .imp])))){
      if(!is.null(ini$imp[[i]])){
         indic <- Data2[, .imp] == m & is.na(Data2[Data2[, .imp]==0, names[i]])
         ini$imp[[names[i]]][m] <- Data2[indic, names[i]]
      }
    } 
  }
  return(ini)
}

# Convert the long-format data frame to a mids object using as.mids2
HSIY2014_mids<-as.mids2(HSIY2014_ImputedData)  

# and back to a data frame, so we can compare the two datasets and make sure nothing went wrong 
HSIY2014_imp_R_2 <- complete(HSIY2014_mids, action="long",include=TRUE)

# Summary before the conversion to mids
table(HSIY2014_ImputedData$V53)
#    1     2 
# 8830 12496 

# Summary after the round trip through mids
table(HSIY2014_imp_R_2$V53)
#    1     2 
#10273 14529 

# The two datasets are inconsistent

The as.mids2 function clearly has a problem. Is there a way to fix it? Many thanks.
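
One thing that may be worth checking: the two tables sum to 21326 and 24802, so the round trip yields 3476 more non-missing values, which would be consistent with the rebuilt object holding a different number of stacked copies. The snippet below is only a minimal base-R sketch of one possible cause, under the assumption (not verified here) that the `.imp` column is stored as a factor. The vector `imp` is a hypothetical stand-in, not the actual data. In that case `max(as.numeric(...))`, as used in as.mids2, would count one imputation too many:

```r
# Hypothetical stand-in for the .imp column of a long-format data set:
# the original data (0) plus m = 5 imputations, stored as a factor.
imp <- factor(rep(0:5, each = 3))

# as.numeric() on a factor returns the internal level codes (1..6),
# not the labels (0..5), so the inferred number of imputations is off by one.
m_from_codes <- max(as.numeric(imp))

# Converting through character first recovers the labels.
m_from_labels <- max(as.numeric(as.character(imp)))

print(c(codes = m_from_codes, labels = m_from_labels))
```

Running class(HSIY2014_ImputedData$.imp) and comparing table(HSIY2014_ImputedData$.imp) with table(HSIY2014_imp_R_2$.imp) would show whether this assumption applies to the data above.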

0 answers:

No answers yet