Removing rows from a data frame

Date: 2016-03-30 14:06:22

Tags: r dataframe rows

I have this data frame:

x <- c(0,55,105,165,270,65,130,155,155,225,250,295,
     30,100,110,135,160,190,230,300,30,70,105,170,
     210,245,300,0,85,175,300,15,60,90,90,140,210,
     260,270,295,5,55,55,90,100,140,190,255,285,270)

y <- c(305,310,305,310,310,260,255,265,285,280,250,
     260,210,240,225,225,225,230,210,215,160,190,
     190,175,160,160,170,120,135,115,110,85,90,90,
     55,55,90,85,50,50,25,30,5,35,15,0,40,20,5,150)

z <- c(870,793,755,690,800,800,730,728,710,780,804,
     855,813,762,765,740,765,760,790,820,855,812,
     773,812,827,805,840,890,820,873,875,873,865,
     841,862,908,855,850,882,910,940,915,890,880,
     870,880,960,890,860,830)

dati5 <- data.frame(x, y, z)

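I want to remove the rows of the data frame that contain the maximum or minimum value of the variables x and y. I also want to keep the removed rows so that I can use them later. How can I do this?

PS: In this case I want to remove all rows where x == 0, x == 300, y == 0, or y == 310.

3 answers:

Answer 0: (score: 3)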

dati5[!(dati5$x %in% max(dati5$x)),]

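This returns the data frame with every row whose x equals the maximum of x removed.

The same expression without the negation ! shows the rows that were removed: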

dati5[(dati5$x %in% max(dati5$x)),]
    x   y   z
20 300 215 820
27 300 170 840
31 300 110 875

Do the same for min and for y.

Edit: As Laterow pointed out, %in% is not needed here:
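For instance, the same pattern applied to the minimum of y might look like the sketch below. It runs on a small made-up data frame (`toy` is a hypothetical stand-in for dati5, and `kept_y` / `dropped_y` are names chosen only for illustration):

```r
# Hypothetical toy data standing in for dati5
toy <- data.frame(x = c(0, 55, 300, 140),
                  y = c(305, 310, 110, 0))

# Rows whose y is NOT the minimum of y (the filtered data)
kept_y <- toy[!(toy$y %in% min(toy$y)), ]

# Rows whose y IS the minimum of y (the rows that were dropped)
dropped_y <- toy[toy$y %in% min(toy$y), ]
```

Assigning both subsets to their own names keeps the dropped rows available for later use.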

dati5[dati5$x != max(dati5$x),]

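Also:

Since you have x stored as a standalone vector, a plain comparison against that vector works as well (this selects the rows holding the maximum of x; use != to drop them instead):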

dati5[x == max(x),]

EDIT2:

As for the comment about needing four separate calls, they can also be done with a single command:

dati5[!(dati5$x %in% c(max(dati5$x), min(dati5$x))) & !(dati5$y %in% c(max(dati5$y), min(dati5$y))),]

Removed rows:

dati5[(dati5$x %in% c(max(dati5$x), min(dati5$x))) | (dati5$y %in% c(max(dati5$y), min(dati5$y))),]
     x   y   z
1    0 305 870
2   55 310 793
4  165 310 690
5  270 310 800
20 300 215 820
27 300 170 840
28   0 120 890
31 300 110 875
46 140   0 880

That is, the rows holding the max/min of x or of y.

Answer 1: (score: 1)
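Since the question also asks to keep the removed rows for later, one option is to compute the condition once and reuse it for both subsets. This is only a sketch on made-up data: `toy`, `is_extreme`, `kept` and `removed` are hypothetical names, and `range()` is used as shorthand for `c(min(...), max(...))`:

```r
# Hypothetical toy data standing in for dati5
toy <- data.frame(
  x = c(0, 55, 165, 300, 140, 90),
  y = c(305, 310, 200, 110, 0, 90),
  z = c(870, 793, 690, 875, 880, 841)
)

# TRUE for rows holding the min or max of x, or the min or max of y
is_extreme <- toy$x %in% range(toy$x) | toy$y %in% range(toy$y)

removed <- toy[is_extreme, ]   # rows set aside for later use
kept    <- toy[!is_extreme, ]  # rows that remain
```

Because both subsets come from the same mask, every row ends up in exactly one of them.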

A one-line solution that easily handles any number of columns:

dati5[!rowSums(sapply(dati5[-3], function(x) x == max(x) | x == min(x))),]

Explanation:

                                 function(x) x == max(x) | x == min(x)       # Return TRUE if element in vector is max or min
               sapply(dati5[-3],                                      )      # Apply this to dati5 (columns x and y)
       rowSums(                                                        )     # Sum this per row (FALSE = 0, TRUE = 1)
      !                                                                      # Logically negate this (0 becomes TRUE, above 0 becomes FALSE)
dati5[                                                                  ,]   # Subset dati5
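To see the intermediate values, the one-liner can be unrolled step by step. The sketch below uses a small made-up data frame (`toy`, `flags`, `hits` and `result` are hypothetical names) and selects the columns by name rather than with `[-3]`, which is a small variation on the answer:

```r
# Hypothetical toy data standing in for dati5
toy <- data.frame(x = c(0, 55, 300, 140, 90),
                  y = c(0, 310, 110, 50, 90),
                  z = c(870, 793, 875, 880, 841))

# Step 1: logical matrix, one column per variable,
# TRUE where the value is that column's max or min
flags <- sapply(toy[c("x", "y")], function(v) v == max(v) | v == min(v))

# Step 2: number of extreme values found in each row
hits <- rowSums(flags)

# Step 3: keep only the rows with zero hits (!0 is TRUE)
result <- toy[!hits, ]
```

Here row 1 is flagged twice (minimum of both x and y), rows 2 and 3 once each, and rows 4 and 5 survive.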

Answer 2: (score: 0)

Maybe this helps?

which_minmax <- function(x) which(x == max(x, na.rm=TRUE) | x == min(x, na.rm=TRUE))
remove_ids <- unique(unlist(lapply(dati5[1:2], which_minmax)))
# filtered dati5
dati5[-remove_ids, ]
# removed dati5
dati5[remove_ids, ]

This can be wrapped in a function:

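remove_minmax <- function(df, cols_to_filter){
  # Row indices where x equals its max or min (NAs ignored)
  which_minmax <- function(x) which(x == max(x, na.rm=TRUE) | x == min(x, na.rm=TRUE))
  # df[cols_to_filter] stays a data frame even for one column; lapply always returns a list
  remove_ids <- unique(unlist(lapply(df[cols_to_filter], which_minmax)))
  list(filtered=df[-remove_ids, ], removed=df[remove_ids, ])
}
# e.g.
remove_minmax(dati5, 1:2)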
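A possible variant of the same idea uses a logical mask instead of row indices. This is a sketch, not a drop-in replacement: `split_minmax`, `toy` and `parts` are hypothetical names, and it relies on `%in%` returning FALSE (rather than NA) for missing values, so rows with NA in a filtered column are simply kept:

```r
# Hypothetical helper: logical-mask variant of the function above
split_minmax <- function(df, cols) {
  # One logical vector per selected column, TRUE at that column's min or max
  per_col <- lapply(df[cols], function(v) v %in% range(v, na.rm = TRUE))
  # A row is extreme if any selected column flags it
  is_extreme <- Reduce(`|`, per_col)
  list(filtered = df[!is_extreme, ], removed = df[is_extreme, ])
}

# Hypothetical toy data, including a missing value in x
toy <- data.frame(x = c(0, 55, 300, NA, 90),
                  y = c(0, 310, 110, 50, 90))

parts <- split_minmax(toy, c("x", "y"))
parts$filtered  # rows without an extreme value
parts$removed   # rows set aside for later use
```

The result has the same shape as before, a list with `filtered` and `removed`, so both pieces stay available under one name.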