Remove rows that meet a condition, column by column, in R

Time: 2019-09-06 21:10:57

Tags: r for-loop delete-row

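I have a data frame (df) with numeric values. I want to write a for loop that iterates over the columns. For each column, it should count the rows whose value is greater than some number (e.g. 3), and then remove those rows entirely before moving on to the next column.

Here is what I have tried so far: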


output <- vector("double", ncol(df))
for (i in 1:ncol(df)) {
  output[[i]] <- length(which(df[i] >= 3))
  df <- df[!df[, i] >= 3, ]
}

But I get the following error:

Error in matrix(if (is.null(value)) logical() else value, nrow = nr, dimnames = list(rn, : length of 'dimnames' [2] not equal to array extent


dput(head(df))

#output:
structure(list(col1 = numeric(0), col2 = numeric(0), (etc.)
NA. = integer(0)), row.names = integer(0), class = "data.frame")

  col1   col2   col3   col4     col5
1 2.09   1.10    0     21.03    0.88
3 0.00   0.00    0     11.71    0.00
4 1.50   1.10    0     1.67     1.76
5 5.10   0.00    0     0.83     17.94
6 0.00   6.34    0     2.10     0.00

For the example above, the final output I am after is a vector with the number of rows removed at each column: (1, 1, 0, 2, 0).

2 answers:

Answer 0: (score: 1)

Here is one way to do it with a for loop:

dummy_df <- df # work on a copy in case you don't want to alter the original df
output <- rep(0, ncol(df)) # one count per column, initialised to 0

for (i in seq_len(ncol(df))) {
  if (nrow(dummy_df) == 0) break # stop once all rows have been removed
  if (!any(dummy_df >= 3)) break # stop once no values >= 3 remain
  output[i] <- sum(dummy_df[[i]] >= 3) # rows removed because of column i
  dummy_df <- dummy_df[dummy_df[[i]] < 3, , drop = FALSE]
}

output
[1] 3 0 1
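
As a check against the question's own numbers, the same loop can be wrapped in a function and run on the five sample rows printed in the question. This is only a sketch: the name count_removed and its threshold argument are made up for illustration, the sample values are retyped by hand from the printed table, and the data is assumed to contain no NA values.

```r
# Hypothetical helper: for each column in turn, count the rows whose value is
# >= threshold, then drop those rows before moving on to the next column.
count_removed <- function(df, threshold = 3) {
  counts <- integer(ncol(df))
  for (i in seq_len(ncol(df))) {
    hit <- df[[i]] >= threshold        # rows to remove because of column i
    counts[i] <- sum(hit)
    df <- df[!hit, , drop = FALSE]     # keep the result a data frame
  }
  counts
}

# Sample rows retyped from the table printed in the question
sample_df <- data.frame(
  col1 = c(2.09, 0.00, 1.50, 5.10, 0.00),
  col2 = c(1.10, 0.00, 1.10, 0.00, 6.34),
  col3 = c(0, 0, 0, 0, 0),
  col4 = c(21.03, 11.71, 1.67, 0.83, 2.10),
  col5 = c(0.88, 0.00, 1.76, 17.94, 0.00)
)

count_removed(sample_df)
# [1] 1 1 0 2 0
```

Indexing with df[[i]] yields a plain vector, so the row filter never has to go through a data frame comparison.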

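Another approach, using apply, may be faster than the loop above. For each row it finds the first column with a value >= 3, which is the column at which that row would be removed:

# columns at which no rows are removed are omitted from the output, but can be added back if needed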
table(apply(df, 1, function(x) match(TRUE, x >= 3)))
1 3 
3 1

Data (thanks to @Sada93):

  a  b c
1 1  1 1
2 2  2 5
3 3  3 2
4 4 10 1
5 5  2 1
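
If the columns with zero removed rows should appear in the result as well, one possible sketch is to tabulate the first-hit column index over all columns. The data frame below is rebuilt by hand from the table above, and the names first_hit and removed are illustrative only.

```r
# Data rebuilt from the printed table above
df <- data.frame(a = 1:5, b = c(1, 2, 3, 10, 2), c = c(1, 5, 2, 1, 1))

# For each row, the index of the first column with a value >= 3 (NA if none)
first_hit <- apply(df, 1, function(x) match(TRUE, x >= 3))

# Count rows per first-hit column; nbins keeps the columns with zero hits
removed <- tabulate(first_hit[!is.na(first_hit)], nbins = ncol(df))
names(removed) <- names(df)
removed
# a b c
# 3 0 1
```

This agrees with the loop's output because a row is deleted at the first column where it reaches the threshold, so later columns never see it.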

Answer 1: (score: 1)

You can do it like this:

Data:
df <- data.frame(x=c(1:5,2),y=c(1,1,1,4,5,2), z= c(2,1,1,2,5,2))

Code:

removed.df <- NULL
for (i in 1:ncol(df)){
  for(j in 1:nrow(df)){
    if(df[j,i] > 3){
      tmp.df <- df[j,]
      tmp.df$index <- j
      removed.df <- rbind(removed.df, tmp.df)
    }
  }
}

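# removed.df holds the rows to delete; its index column records their original row numbers.
# A row that exceeds 3 in several columns is collected more than once, so drop the duplicates.
removed.df <- removed.df[!duplicated(removed.df$index),]

# now remove those rows (the index column of removed.df) from df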
df[-removed.df$index,]

> df[-removed.df$index,]
  x y z
1 1 1 2
2 2 1 1
3 3 1 1
6 2 2 2
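
The nested loop can also be written without explicit iteration. The following is a minimal sketch of that alternative, not the approach of the answer above: it flags every row with at least one value greater than 3 using rowSums, and the names exceeds, removed_rows and kept_rows are made up for illustration.

```r
df <- data.frame(x = c(1:5, 2), y = c(1, 1, 1, 4, 5, 2), z = c(2, 1, 1, 2, 5, 2))

# TRUE for rows in which at least one column exceeds 3
exceeds <- rowSums(df > 3) > 0

removed_rows <- df[exceeds, , drop = FALSE]   # original rows 4 and 5
kept_rows    <- df[!exceeds, , drop = FALSE]  # original rows 1, 2, 3 and 6
kept_rows
```

A logical index also behaves sensibly when no row exceeds the threshold, whereas negative indexing with an empty index would return zero rows.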