Count missing values by row for selected variables

Time: 2016-01-19 22:06:38

Tags: r

Given a data frame with 6 variables:

x1 var1 x2 var2 x3 var3

How can I count the missing values in var1, var2, and var3 BY ROW, so that the data frame ends up with the following variables:

x1 var1 x2 var2 x3 var3 num.missing

1 answer:

Answer 0: (score: 0)

A reproducible dataset with the expected answer would have been very helpful. I'll create one for you:

set.seed(1337)
dat <- data.frame(x1=1:10, var1=runif(10), 
                  x2=11:20, var2=runif(10), 
                  x3=21:30, var3=runif(10))
dat
   x1       var1 x2       var2 x3       var3
1   1 0.57632155 11 0.97943029 21 0.84916377
2   2 0.56474213 12 0.99371759 22 0.72408821
3   3 0.07399023 13 0.82735873 23 0.04661798
4   4 0.45386562 14 0.19398230 24 0.15367816
5   5 0.37327926 15 0.98132543 25 0.56259417
6   6 0.33131745 16 0.02522857 26 0.98142569
7   7 0.94763002 17 0.97238848 27 0.93177423
8   8 0.28111731 18 0.92379666 28 0.89861494
9   9 0.24540405 19 0.33913968 29 0.46979326
10 10 0.14604362 20 0.24657940 30 0.99500811

Replace a random sample of values in each var column with NA:

dat[sample(1:10, 3), "var1"] <- NA
dat[sample(1:10, 3), "var2"] <- NA
dat[sample(1:10, 3), "var3"] <- NA
dat
   x1       var1 x2      var2 x3      var3
1   1         NA 11 0.9794303 21 0.8491638
2   2 0.56474213 12 0.9937176 22 0.7240882
3   3 0.07399023 13        NA 23        NA
4   4 0.45386562 14 0.1939823 24 0.1536782
5   5 0.37327926 15 0.9813254 25 0.5625942
6   6         NA 16        NA 26 0.9814257
7   7 0.94763002 17 0.9723885 27        NA
8   8 0.28111731 18        NA 28 0.8986149
9   9         NA 19 0.3391397 29 0.4697933
10 10 0.14604362 20 0.2465794 30        NA

Since logical values are coerced to integers in arithmetic (TRUE == 1, FALSE == 0), we can simply add up the is.na() tests:

dat$num.missing <- is.na(dat$var1) + is.na(dat$var2) + is.na(dat$var3)
dat
   x1       var1 x2      var2 x3      var3 num.missing
1   1         NA 11 0.9794303 21 0.8491638           1
2   2 0.56474213 12 0.9937176 22 0.7240882           0
3   3 0.07399023 13        NA 23        NA           2
4   4 0.45386562 14 0.1939823 24 0.1536782           0
5   5 0.37327926 15 0.9813254 25 0.5625942           0
6   6         NA 16        NA 26 0.9814257           2
7   7 0.94763002 17 0.9723885 27        NA           1
8   8 0.28111731 18        NA 28 0.8986149           1
9   9         NA 19 0.3391397 29 0.4697933           1
10 10 0.14604362 20 0.2465794 30        NA           1
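
If there are more than a few columns to check, adding the is.na() calls by hand gets tedious. One possible alternative, not part of the answer above, is to apply is.na() to the selected columns at once and count the TRUE values per row with rowSums(). The small data frame and the `cols` vector below are made up for illustration only:

```r
# Hypothetical toy data with the same layout as in the question
dat <- data.frame(
  x1 = 1:4,  var1 = c(NA, 0.5, 0.1, NA),
  x2 = 5:8,  var2 = c(0.9, 0.8, NA, NA),
  x3 = 9:12, var3 = c(0.3, 0.7, NA, 0.2)
)

# Names of the columns to check (an assumption; adjust to your data)
cols <- c("var1", "var2", "var3")

# is.na() on a data frame returns a logical matrix;
# rowSums() then counts the TRUE entries in each row.
# dat[cols] keeps a data frame even if cols has length one.
dat$num.missing <- unname(rowSums(is.na(dat[cols])))
dat
```

For this toy data the counts are 1, 0, 2 and 2. The result matches what the explicit sum of is.na() calls would give; the only difference is that the set of columns lives in one vector and can be changed without touching the counting line.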