Return the completed dataset from the mi package without the missingness indicators

Time: 2016-01-23 23:48:04

Tags: r package packages missing-data imputation

dfmiss=data.frame(x=c(1,4,6,NA,7,NA,9,10,4,3),
       y=c(10,12,NA,NA,14,18,20,15,12,17),
       z=c(225,198,520,147,NA,130,NA,200,NA,99),
       v=c(44,51,74,89,45,55,25,36,75,25))

I imputed this incomplete data with the mi package as follows:

install.packages("mi")
library(mi)
mdf <- missing_data.frame(dfmiss)  # convert the data frame to a missing_data.frame
imp <- mi(mdf)                     # run the imputation
complete(imp, 1)                   # return one completed data frame
               x         y         z  v missing_x missing_y missing_z
    1   1.000000 10.000000 225.00000 44     FALSE     FALSE     FALSE
    2   4.000000 12.000000 198.00000 51     FALSE     FALSE     FALSE
    3   6.000000 -2.631072 520.00000 74     FALSE      TRUE     FALSE
    4   9.189989 14.760334 147.00000 89      TRUE      TRUE     FALSE
    5   7.000000 14.000000 188.37644 45     FALSE     FALSE      TRUE
    6  11.127962 18.000000 130.00000 55      TRUE     FALSE     FALSE
    7   9.000000 20.000000  92.30703 25     FALSE     FALSE      TRUE
    8  10.000000 15.000000 200.00000 36     FALSE     FALSE     FALSE
    9   4.000000 12.000000 184.29575 75     FALSE     FALSE      TRUE
    10  3.000000 17.000000  99.00000 25     FALSE     FALSE     FALSE

The complete() command returns the completed dataset, but I want this completed dataset returned without the TRUE/FALSE columns [missing_x, missing_y, missing_z].

1 answer:

Answer 0: (score: 1)

You can remove the extra columns, i.e. drop missing_x, missing_y and missing_z from the data frame that complete() returns.
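One way to drop the indicator columns can be sketched in base R as below. This is an illustration rather than the answerer's original code: the data frame `completed` is a hypothetical stand-in for the result of `complete(imp, 1)`, built by hand so the example runs without the mi package, and it assumes the indicator columns are exactly those whose names start with `missing_`.

```r
# Hypothetical stand-in for the data frame returned by complete(imp, 1):
# imputed variables plus logical missing_* indicator columns.
completed <- data.frame(
  x = c(1, 4, 9.19),
  y = c(10, -2.63, 14.76),
  v = c(44, 74, 89),
  missing_x = c(FALSE, FALSE, TRUE),
  missing_y = c(FALSE, TRUE, TRUE)
)

# Keep only the columns whose names do not start with "missing_".
keep <- !grepl("^missing_", names(completed))
clean <- completed[, keep, drop = FALSE]

print(names(clean))
```

With the objects from the question, the same two lines would be applied to `complete(imp, 1)` in place of `completed`. Matching on the name prefix avoids hard-coding which variables had missing values.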