Delete rows based on the values of specific columns

Time: 2015-08-27 19:42:32

Tags: r

I have the following dataset:

 ID <- c(1,2,3,4,5,6,7,8,9,10)
x1 <- c(1.3,    1.4,    NA, NA, 1.4,    -1.0,   NA, 0.3,    0.7,    NA)
x2 <- c(4.6,    2.6,    NA, 4.3,    NA, 5.6,    NA, 3.7,    5.3,    NA)
x3 <- c(-0.9,   5.6,    NA, -1.3,   NA, -3.4,   NA, 0.3,    -2.6,   NA)
x4 <- c(10.5,   NA, NA, 0.1,    -0.5,   NA, NA, 21.5,   2.0,    NA)
x5 <- c(9.5,    -5.0,   NA, -0.7,   3.6,    3.8,    -7.8,   9.8,    -12.2,  NA)
x6 <- c(-10.3,  NA, -4.4,   NA, 12.2,   NA, NA, -4.1,   3.3,    NA)

alldata <- data.frame(ID,x1,x2,x3,x4,x5,x6)

ID  x1    x2    x3    x4    x5     x6
1   1.3   4.6   -0.9  10.5  9.5    -10.3
2   1.4   2.6   5.6   "NA"  -5.0   "NA"
3   "NA"  "NA"  "NA"  "NA"  "NA"   -4.4
4   "NA"  4.3   -1.3  0.1   -0.7   "NA"
5   1.4   "NA"  "NA"  -0.5  3.6    12.2
6   -1.0  5.6   -3.4  "NA"  3.8    "NA"
7   "NA"  "NA"  "NA"  "NA"  -7.8   "NA"
8   0.3   3.7   0.3   21.5  9.8    -4.1
9   0.7   5.3   -2.6  2.0   -12.2  3.3
10  "NA"  "NA"  "NA"  "NA"  "NA"   "NA"

I need to delete any row in which x1 through x5 are ALL "NA". I don't care whether x6 has a value or is "NA".

So my data should end up looking like this:

ID  x1    x2    x3    x4    x5     x6
1   1.3   4.6   -0.9  10.5  9.5    -10.3
2   1.4   2.6   5.6   "NA"  -5.0   "NA"
4   "NA"  4.3   -1.3  0.1   -0.7   "NA"
5   1.4   "NA"  "NA"  -0.5  3.6    12.2
6   -1.0  5.6   -3.4  "NA"  3.8    "NA"
7   "NA"  "NA"  "NA"  "NA"  -7.8   "NA"
8   0.3   3.7   0.3   21.5  9.8    -4.1
9   0.7   5.3   -2.6  2.0   -12.2  3.3

3 answers:

Answer 0 (score: 3)

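You can do it like this:

alldata[rowSums(!is.na(alldata[2:6])) > 0, ]

Breaking this apart:

alldata[2:6]

gets the x1 through x5 columns you care about. (Better practice might be subset(alldata, select = x1:x5), so that you don't depend on exact column indices.) Then

!is.na(alldata[2:6])

gives a TRUE/FALSE matrix showing which entries are not NA,

rowSums(!is.na(alldata[2:6]))

tells you how many entries in each row are not NA,

rowSums(!is.na(alldata[2:6])) > 0

tells you which rows have at least one non-NA entry, and

alldata[rowSums(!is.na(alldata[2:6])) > 0, ]

filters to only those rows.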
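The same filter can be written against column names instead of positions. This is only a minimal sketch, assuming the columns hold real NA values (not the string "NA"); the names key_cols, keep and result are illustrative and not part of the original answer.

```r
# Rebuild the question's data with real (numeric) NA values.
ID <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
x1 <- c(1.3, 1.4, NA, NA, 1.4, -1.0, NA, 0.3, 0.7, NA)
x2 <- c(4.6, 2.6, NA, 4.3, NA, 5.6, NA, 3.7, 5.3, NA)
x3 <- c(-0.9, 5.6, NA, -1.3, NA, -3.4, NA, 0.3, -2.6, NA)
x4 <- c(10.5, NA, NA, 0.1, -0.5, NA, NA, 21.5, 2.0, NA)
x5 <- c(9.5, -5.0, NA, -0.7, 3.6, 3.8, -7.8, 9.8, -12.2, NA)
x6 <- c(-10.3, NA, -4.4, NA, 12.2, NA, NA, -4.1, 3.3, NA)
alldata <- data.frame(ID, x1, x2, x3, x4, x5, x6)

# Columns that decide whether a row is kept (illustrative name).
key_cols <- c("x1", "x2", "x3", "x4", "x5")

# TRUE for rows with at least one non-NA value among the key columns.
keep <- rowSums(!is.na(alldata[key_cols])) > 0

# Rows 3 and 10 are NA in all of x1..x5, so they are dropped.
result <- alldata[keep, ]
print(result$ID)
```

Selecting by name means the filter keeps working if columns are later reordered or new ones are inserted before x1.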

Answer 1 (score: 0)

A base R solution based on your data (read my comment). With real NA values (rather than the string "NA"), I don't think this solution will work.

alldata[!rowSums(alldata[2:6] == "NA") == 5, ]
  ID   x1  x2   x3   x4    x5    x6
1  1  1.3 4.6 -0.9 10.5   9.5 -10.3
2  2  1.4 2.6  5.6   NA    -5    NA
4  4   NA 4.3 -1.3  0.1  -0.7    NA
5  5  1.4  NA   NA -0.5   3.6  12.2
6  6   -1 5.6 -3.4   NA   3.8    NA
7  7   NA  NA   NA   NA  -7.8    NA
8  8  0.3 3.7  0.3 21.5   9.8  -4.1
9  9  0.7 5.3 -2.6    2 -12.2   3.3
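To illustrate why comparing against the string "NA" breaks down once the values are real NA, here is a small sketch. The data frame df and the names string_counts, na_counts and kept are made up for illustration and are not from the answer.

```r
# A tiny data frame with real NA values.
df <- data.frame(ID = 1:3,
                 a = c(1, NA, NA),
                 b = c(NA, NA, 2))

# Comparing a real NA to the string "NA" yields NA rather than TRUE,
# so every row sum that touches an NA becomes NA as well.
string_counts <- rowSums(df[c("a", "b")] == "NA")

# is.na() detects real missing values directly.
na_counts <- rowSums(is.na(df[c("a", "b")]))

# Keep rows where not every key column is missing (2 key columns here).
kept <- df[na_counts < 2, ]
print(kept)
```

In this sketch only row 2 is missing in both a and b, so it is the only row removed.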

Answer 2 (score: 0)

Here is an approach using rowSums:

First, I convert your factor "NA" values to actual NA:

str(alldata)
'data.frame':   10 obs. of  7 variables:
 $ ID: num  1 2 3 4 5 6 7 8 9 10
 $ x1: Factor w/ 6 levels "-1","0.3","0.7",..: 4 5 NA NA 5 1 NA 2 3 NA
 $ x2: Factor w/ 7 levels "2.6","3.7","4.3",..: 4 1 NA 3 NA 6 NA 2 5 NA
 $ x3: Factor w/ 7 levels "-0.9","-1.3",..: 1 6 NA 2 NA 4 NA 5 3 NA
 $ x4: Factor w/ 6 levels "-0.5","0.1","10.5",..: 3 NA NA 2 1 NA NA 5 4 NA
 $ x5: Factor w/ 9 levels "-0.7","-12.2",..: 7 3 NA 1 5 6 4 8 2 NA
 $ x6: Factor w/ 6 levels "-10.3","-4.1",..: 1 NA 3 NA 4 NA NA 2 5 NA

alldata[alldata=="NA"]=NA


sum(is.na(alldata))
    24
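After this replacement the x columns are still factors, which can be surprising if you go on to do arithmetic with them. A minimal sketch of one common conversion, assuming every level is numeric text; the data frame d and the name num_cols are made up for illustration:

```r
# Values as they might look when "NA" was entered as a string.
d <- data.frame(ID = 1:3,
                x1 = factor(c("1.3", "NA", "-1")),
                x2 = factor(c("NA", "4.3", "5.6")))

# Turn the "NA" strings into real missing values, as in the answer.
d[d == "NA"] <- NA

# Convert via character: as.numeric() applied directly to a factor
# would return the internal level codes, not the printed values.
num_cols <- c("x1", "x2")
d[num_cols] <- lapply(d[num_cols], function(f) as.numeric(as.character(f)))

str(d)
```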

Next, I show how to find which rows are NA in all of the variables of interest:

which(rowSums(is.na(alldata[,c("x1","x2","x3","x4","x5")]))==5)
[1]  3 10

Finally, we extract the desired rows (those where the key variables are not all NA):

 alldata[-which(rowSums(is.na(alldata[,c("x1","x2","x3","x4","x5")]))==5),]
  ID   x1   x2   x3   x4    x5    x6
1  1  1.3  4.6 -0.9 10.5   9.5 -10.3
2  2  1.4  2.6  5.6 <NA>    -5  <NA>
4  4 <NA>  4.3 -1.3  0.1  -0.7  <NA>
5  5  1.4 <NA> <NA> -0.5   3.6  12.2
6  6   -1  5.6 -3.4 <NA>   3.8  <NA>
7  7 <NA> <NA> <NA> <NA>  -7.8  <NA>
8  8  0.3  3.7  0.3 21.5   9.8  -4.1
9  9  0.7  5.3 -2.6    2 -12.2   3.3
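One caveat with the -which(...) form: if no row matches, which() returns an empty vector, and the negative index then selects zero rows instead of all of them. A minimal sketch of a logical-index variant, assuming real NA values; the data frame d and the name all_missing are made up for illustration:

```r
d <- data.frame(ID = 1:3,
                x1 = c(1.3, NA, 0.7),
                x2 = c(NA, 4.3, 5.3))

# No row here is missing in every key column, so this is all FALSE.
all_missing <- rowSums(is.na(d[c("x1", "x2")])) == 2

# -which() on an empty match drops every row ...
n_with_which <- nrow(d[-which(all_missing), ])

# ... while negating the logical vector keeps all rows, as intended.
n_with_logical <- nrow(d[!all_missing, ])

print(c(n_with_which, n_with_logical))
```

When at least one row matches, both forms return the same result, so the logical form is a drop-in replacement.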