How do I use predict() on my pooled mice() results?

Time: 2018-10-09 05:11:27

Tags: r predict r-mice

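Hi, I have just started using R as part of a school module. I have a dataset with missing values, and I used mice() to impute them. I am now trying to use predict() on the pooled results, but I get the following error: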

Error in UseMethod("predict") : no applicable method for 'predict' applied to an object of class "c('mipo', 'data.frame')"

I have included my entire code below. I would really appreciate any help for a beginner. Thank you!

```{r}
library(magrittr)
library(dplyr)
train = read.csv("Train_Data.csv", na.strings=c("","NA"))
test = read.csv("Test_Data.csv", na.strings=c("","NA"))
cols <- c("naCardiac", "naFoodNutrition", "naGenitourinary", "naGastrointestinal", "naMusculoskeletal", "naNeurological", "naPeripheralVascular", "naPain", "naRespiratory", "naSkin")
train %<>%
       mutate_each_(funs(factor(.)),cols)
test %<>%
       mutate_each_(funs(factor(.)),cols)
str(train)
str(test)
```

```{r}
library(mice)
md.pattern(train)
```

```{r}
miTrain = mice(train, m = 5, maxit = 50, meth = "pmm")
```

```{r}
model = with(miTrain, lm(LOS ~ Age + Gender + Race + Temperature + RespirationRate + HeartRate + SystolicBP + DiastolicBP + MeanArterialBP + CVP + Braden + SpO2 + FiO2 + PO2_POCT + Haemoglobin + NumWBC + Haematocrit + NumPlatelets + ProthrombinTime + SerumAlbumin + SerumChloride + SerumPotassium + SerumSodium + SerumLactate + TotalBilirubin + ArterialpH + ArterialpO2 + ArterialpCO2 + ArterialSaO2 + Creatinine + Urea + GCS + naCardiac + GCS + naCardiac + naFoodNutrition + naGenitourinary + naGastrointestinal + naMusculoskeletal + naNeurological + naPeripheralVascular + naPain + naRespiratory + naSkin))
model
summary(model)
```

```{r}
modelResults = pool(model)
modelResults
```

```{r}
pred = predict(modelResults, newdata = test)
PredTest = data.frame(test$PatientID, modelResults)
str(PredTest)
summary(PredTest)
```

1 Answer:

Answer 0: (score: 0)

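One slightly hacky way to achieve this is to take one of the fitted models created by with() (stored in fit$analyses) and replace its stored coefficients with the final pooled estimates. I have not tested this in detail, but it seems to work on this simple example: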

```r
library(mice)

imp <- mice(nhanes, maxit = 2, m = 2)
fit <- with(data = imp, exp = lm(bmi ~ hyp + chl))
pooled <- pool(fit)

# Copy one of the fitted lm models fit to
#   one of the imputed datasets
pooled_lm = fit$analyses[[1]]
# Replace the fitted coefficients with the pooled
#   estimates (need to check they are replaced in
#   the correct order)
pooled_lm$coefficients = summary(pooled)$estimate

# Predict - predictions seem to match the
#   pooled coefficients rather than the original
#   lm that was copied
predict(fit$analyses[[1]], newdata = nhanes)
predict(pooled_lm, newdata = nhanes)
```
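The ordering concern in the comments above can be sidestepped with a small sketch like the one below. It assumes a recent mice version (3.x or later) in which summary(pooled) has an estimate column. It relies on the fact that, for point estimates, Rubin's rules reduce to the plain mean of the per-imputation coefficients, so the named coefficient vectors can be averaged directly. The seed value and the names coef_by_fit and pooled_named are illustrative choices, not part of the original answer.

```r
library(mice)

# Arbitrary seed, used only so the sketch is reproducible
imp <- mice(nhanes, maxit = 2, m = 2, printFlag = FALSE, seed = 1)
fit <- with(imp, lm(bmi ~ hyp + chl))
pooled <- pool(fit)

# Terms x m matrix of coefficients; the row names are the term names
coef_by_fit <- sapply(fit$analyses, coef)

# Mean across imputations: the pooled point estimates, with names kept
pooled_named <- rowMeans(coef_by_fit)

# Compare with what pool() reports, in the order it reports them.
# Anything other than TRUE suggests an ordering or version difference.
all.equal(unname(pooled_named), summary(pooled)$estimate)

# Use the named vector when overwriting the copied model
pooled_lm <- fit$analyses[[1]]
pooled_lm$coefficients <- pooled_named
```

Assigning a named vector means a mismatch in term order is visible on inspection rather than silently producing wrong predictions.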

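As far as I know, the point predictions from predict() for a linear regression depend only on the coefficients, so you should not need to replace the other values stored in the fitted model (but you would have to if you apply methods other than predict()).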
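An alternative that avoids modifying a fitted model is to predict from each imputed-data model and average the results. For a linear model this gives the same point predictions as predicting from the averaged coefficients. The sketch below is a minimal illustration under stated assumptions: the newdata frame is a made-up stand-in for a test set, and the seed is arbitrary. It produces point predictions only and does not combine the prediction uncertainty across imputations.

```r
library(mice)

# Arbitrary seed, used only so the sketch is reproducible
imp <- mice(nhanes, maxit = 2, m = 2, printFlag = FALSE, seed = 1)
fit <- with(imp, lm(bmi ~ hyp + chl))

# Hypothetical stand-in for a test set; predictors must have no missing values
newdata <- data.frame(hyp = c(1, 2), chl = c(180, 220))

# One column of predictions per imputed-data model
pred_by_fit <- sapply(fit$analyses, predict, newdata = newdata)

# Average across imputations to get the pooled point predictions
pred_pooled <- rowMeans(pred_by_fit)
pred_pooled
```

If the real test set itself has missing predictor values, those rows would need to be imputed first; otherwise predict() returns NA for them.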