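Has anyone used IRMI (from the VIM package) to impute missing Age values? I am experimenting with the Titanic data and several imputation packages. When I call irmi, I expect the missing values to be filled in, but it doesn't seem to do anything. Is my implementation incorrect? Age is a numeric variable...
My code is below. Any insight into why this isn't working would be very helpful...
Thanks, 西马克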
> trainData <- read.csv("titanic.csv", header = TRUE)
> summary(trainData$Age)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
   0.42   20.12   28.00   29.70   38.00   80.00     177
> library(VIM)
> countNA(trainData)
[1] 177
> imputed.train <- irmi(trainData,mixed=NULL)
Time difference of -0.08631301 secs
Imputation performed on the following data set:
            type      #missing
PassengerId "integer" "0"
Survived    "binary"  "0"
Pclass      "integer" "0"
Name        "nominal" "0"
Sex         "binary"  "0"
Age         "numeric" "177"
SibSp       "integer" "0"
Parch       "integer" "0"
Ticket      "nominal" "0"
Fare        "numeric" "0"
Cabin       "nominal" "0"
Embarked    "nominal" "0"
IsChild     "numeric" "0"
> countNA(imputed.train)
[1] 177
I understand that in this case a measure of central tendency would suffice for filling in the missing values, but I would like to know why this didn't work.
The dataframe has mixed data types and Age is numeric.
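For reference, the central-tendency fallback mentioned above can be sketched in base R. This is only a minimal sketch on a made-up toy data frame (the `toy` object and its values are assumptions for illustration, not the actual titanic.csv). It does not explain the irmi behaviour. It only shows the baseline median fill, together with a per-column NA check that makes it easier to see which variables actually changed.

```r
# Toy stand-in for the Titanic training data (values are invented for illustration).
toy <- data.frame(
  Age  = c(22, NA, 38, 26, NA, 54),
  Fare = c(7.25, 8.05, 71.28, 7.93, 8.46, 51.86),
  Sex  = factor(c("male", "male", "female", "female", "male", "male"))
)

# Per-column NA counts before imputation.
na_before <- colSums(is.na(toy))

# Median imputation: replace each missing Age with the median of the observed ages.
age_median <- median(toy$Age, na.rm = TRUE)
imputed <- toy
imputed$Age[is.na(imputed$Age)] <- age_median

# Per-column NA counts after imputation, shown side by side with the counts before.
na_after <- colSums(is.na(imputed))
print(rbind(before = na_before, after = na_after))
```

The same before/after comparison with `colSums(is.na(...))` could be applied to `trainData` and `imputed.train` to check column by column whether anything was filled in.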