How to remove rows that do not meet a condition

Time: 2016-09-15 16:57:40

Tags: r

I have been working with the data below:

df<- structure(list(V1 = structure(1:6, .Label = c("A", "B", "C", 
"D", "E", "F"), class = "factor"), V2 = structure(1:6, .Label = c("AA", 
"BB", "CC", "DD", "EE", "FF"), class = "factor"), V3 = structure(c(6L, 
5L, 4L, 1L, 3L, 2L), .Label = c("hddu", "jfhu", "jshsg", "kduf", 
"Tlsu", "Trsm"), class = "factor"), V4 = c(1L, 2L, 0L, 0L, 5L, 
6L), V5 = c(0L, 2L, 0L, 4L, 0L, 5L), V6 = c(0L, 0L, 4L, 6L, 0L, 
7L), V7 = c(0L, 0L, 5L, 0L, 0L, 8L), V8 = c(0L, 0L, 0L, 0L, 6L, 
0L), V9 = c(0L, 0L, 0L, 7L, 0L, 0L)), .Names = c("V1", "V2", 
"V3", "V4", "V5", "V6", "V7", "V8", "V9"), class = "data.frame", row.names = c(NA, 
-6L))

It looks like this:

  V1 V2    V3 V4 V5 V6 V7 V8 V9
1  A AA  Trsm  1  0  0  0  0  0
2  B BB  Tlsu  2  2  0  0  0  0
3  C CC  kduf  0  0  4  5  0  0
4  D DD  hddu  0  4  6  0  0  7
5  E EE jshsg  5  0  0  0  6  0
6  F FF  jfhu  6  5  7  8  0  0

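I want to remove the rows that do not have values (nonzero entries) in at least two consecutive columns. For example, a row should have values in the first two value columns, or the second two, or the third two; if it has more, that is fine. I want to detect the rows that fail this and produce the output without them. In this case those are rows 1, 4 and 5. So I need two outputs:

1- The indices 1, 4 and 5 (showing which rows were removed)
2- The expected output, which looks like this: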

B   BB  Tlsu    2   2   0   0   0   0
C   CC  kduf    0   0   4   5   0   0
F   FF  jfhu    6   5   7   8   0   0

2 answers:

Answer 0: (score: 1)

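You can manually select, by index, two data frames that are shifted horizontally by one column, use the vectorized & to check whether any two adjacent columns are both nonzero, and use rowSums to collapse that row-wise condition into a filtering index: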

df[rowSums(df[4:8] & df[5:9]) != 0, ]

#   V1 V2   V3 V4 V5 V6 V7 V8 V9
# 2  B BB Tlsu  2  2  0  0  0  0
# 3  C CC kduf  0  0  4  5  0  0
# 4  D DD hddu  0  4  6  0  0  7
# 6  F FF jfhu  6  5  7  8  0  0
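
To see why row 4 is kept by this version, it may help to look at the intermediate matrix. This is only a sketch: it rebuilds the data with read.table for brevity, and the names left, right and both are illustrative, not part of the code above.

```r
# Rebuild the question's data from its printed form.
df <- read.table(header = TRUE, text = "
V1 V2 V3    V4 V5 V6 V7 V8 V9
A  AA Trsm  1  0  0  0  0  0
B  BB Tlsu  2  2  0  0  0  0
C  CC kduf  0  0  4  5  0  0
D  DD hddu  0  4  6  0  0  7
E  EE jshsg 5  0  0  0  6  0
F  FF jfhu  6  5  7  8  0  0")

left  <- df[4:8] != 0   # V4..V8 as a logical matrix, TRUE where nonzero
right <- df[5:9] != 0   # V5..V9, the same columns shifted by one
both  <- left & right   # TRUE where a column and its right neighbour are both nonzero

rowSums(both)           # adjacent nonzero pairs per row: 0 1 1 1 0 3
```

Row 4 gets a count of 1 because V5 and V6 are both nonzero, even though they sit in different two-column groups.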

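If the columns must instead be paired off two at a time (every other column), which matches the expected output in the question, seq can be used to generate the necessary indices: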

df[rowSums(df[seq(4, 9, 2)] & df[seq(5, 9, 2)]) != 0, ]

#  V1 V2   V3 V4 V5 V6 V7 V8 V9
#2  B BB Tlsu  2  2  0  0  0  0
#3  C CC kduf  0  0  4  5  0  0
#6  F FF jfhu  6  5  7  8  0  0

Answer 1: (score: 1)

The logic is not clear. However, this seems to work:

 df[Reduce(`|`, Map(`&`, df[-(1:3)][c(TRUE, FALSE)], df[-(1:3)][c(FALSE, TRUE)])),]
 #  V1 V2   V3 V4 V5 V6 V7 V8 V9
 #2  B BB Tlsu  2  2  0  0  0  0
 #3  C CC kduf  0  0  4  5  0  0
 #6  F FF jfhu  6  5  7  8  0  0
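
The one-liner above can be hard to read, so here is the same idea unpacked step by step. This is a sketch for illustration only; the intermediate names vals, first, second and pair_ok are not from the original code.

```r
# Rebuild the question's data from its printed form.
df <- read.table(header = TRUE, text = "
V1 V2 V3    V4 V5 V6 V7 V8 V9
A  AA Trsm  1  0  0  0  0  0
B  BB Tlsu  2  2  0  0  0  0
C  CC kduf  0  0  4  5  0  0
D  DD hddu  0  4  6  0  0  7
E  EE jshsg 5  0  0  0  6  0
F  FF jfhu  6  5  7  8  0  0")

vals   <- df[-(1:3)]             # drop the three label columns, leaving V4..V9
first  <- vals[c(TRUE, FALSE)]   # logical index is recycled: V4, V6, V8
second <- vals[c(FALSE, TRUE)]   # V5, V7, V9

pair_ok <- Map(`&`, first, second)  # one logical vector per pair of columns
keep    <- Reduce(`|`, pair_ok)     # TRUE if any pair is fully nonzero

df[keep, ]                          # rows 2, 3 and 6
```

Map applies & column by column, and Reduce then combines the three per-pair results with |, so a row is kept as soon as one pair has two nonzero values.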