I have this data frame:
x <- c(0,55,105,165,270,65,130,155,155,225,250,295,
30,100,110,135,160,190,230,300,30,70,105,170,
210,245,300,0,85,175,300,15,60,90,90,140,210,
260,270,295,5,55,55,90,100,140,190,255,285,270)
y <- c(305,310,305,310,310,260,255,265,285,280,250,
260,210,240,225,225,225,230,210,215,160,190,
190,175,160,160,170,120,135,115,110,85,90,90,
55,55,90,85,50,50,25,30,5,35,15,0,40,20,5,150)
z <- c(870,793,755,690,800,800,730,728,710,780,804,
855,813,762,765,740,765,760,790,820,855,812,
773,812,827,805,840,890,820,873,875,873,865,
841,862,908,855,850,882,910,940,915,890,880,
870,880,960,890,860,830)
dati5 <- data.frame(x, y, z)
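I want to remove the rows of the data frame that contain the maximum or minimum value of the variables x and y. I would also like to keep those rows so that I can use them later. How can I do this?
PS: In this case I want to remove every row where x == 0 or x == 300 or y == 0 or y == 310.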
Answer 0 (score: 3)
dati5[!(dati5$x %in% max(dati5$x)),]
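This returns the data frame with every row removed whose x value equals the maximum of x.
The same expression without the negation ! shows the rows that were removed: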
dati5[(dati5$x %in% max(dati5$x)),]
x y z
20 300 215 820
27 300 170 840
31 300 110 875
Do the same for min and for y.
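A minimal sketch of what "the same for min and y" could look like as four separate steps. It uses a small hypothetical `toy` frame rather than `dati5`, and it assumes the limits should be taken from the unfiltered data, so they are computed once with `range()` before any rows are dropped (otherwise each step would see a new maximum or minimum).

```r
# Hypothetical toy data, not the dati5 frame from the question
toy <- data.frame(x = c(0, 5, 10, 5, 3, 10, 4),
                  y = c(7, 1, 4, 9, 4, 5, 6),
                  z = 1:7)

# range() returns c(min, max); take the limits from the unfiltered data
x_lim <- range(toy$x)
y_lim <- range(toy$y)

out <- toy[toy$x != x_lim[2], ]  # drop rows at the x maximum
out <- out[out$x != x_lim[1], ]  # drop rows at the x minimum
out <- out[out$y != y_lim[2], ]  # drop rows at the y maximum
out <- out[out$y != y_lim[1], ]  # drop rows at the y minimum

out
```

Only rows 5 and 7 of the toy frame survive all four filters.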
Edit:
As Laterow pointed out, %in% is not needed here:
dati5[dati5$x != max(dati5$x),]
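Also:
Since you have already stored x as a separate vector, a simple comparison on that vector works as well (the line below selects the rows at the maximum; use != to drop them):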
dati5[x == max(x),]
EDIT2:
As for the comment about making four separate calls, they can also be done in a single command:
dati5[!(dati5$x %in% c(max(dati5$x), min(dati5$x))) & !(dati5$y %in% c(max(dati5$y), min(dati5$y))),]
The removed rows:
dati5[(dati5$x %in% c(max(dati5$x), min(dati5$x))) | (dati5$y %in% c(max(dati5$y), min(dati5$y))),]
x y z
1 0 305 870
2 55 310 793
4 165 310 690
5 270 310 800
20 300 215 820
27 300 170 840
28 0 120 890
31 300 110 875
46 140 0 880
These are the rows that hold the maximum or minimum of x or of y.
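Since the question also asks to keep the removed rows for later, one option is to store the row test once and reuse it for both subsets. This is only a sketch: the names `is_extreme`, `kept` and `removed` are made up for illustration, and `range()` is used as a shorthand for `c(min(...), max(...))`.

```r
x <- c(0,55,105,165,270,65,130,155,155,225,250,295,
       30,100,110,135,160,190,230,300,30,70,105,170,
       210,245,300,0,85,175,300,15,60,90,90,140,210,
       260,270,295,5,55,55,90,100,140,190,255,285,270)
y <- c(305,310,305,310,310,260,255,265,285,280,250,
       260,210,240,225,225,225,230,210,215,160,190,
       190,175,160,160,170,120,135,115,110,85,90,90,
       55,55,90,85,50,50,25,30,5,35,15,0,40,20,5,150)
z <- c(870,793,755,690,800,800,730,728,710,780,804,
       855,813,762,765,740,765,760,790,820,855,812,
       773,812,827,805,840,890,820,873,875,873,865,
       841,862,908,855,850,882,910,940,915,890,880,
       870,880,960,890,860,830)
dati5 <- data.frame(x, y, z)

# TRUE for rows holding the max or min of x, or the max or min of y
is_extreme <- dati5$x %in% range(dati5$x) | dati5$y %in% range(dati5$y)

kept    <- dati5[!is_extreme, ]  # rows to work with now
removed <- dati5[is_extreme, ]   # rows set aside for later use

nrow(kept)     # 41
nrow(removed)  # 9
```

Because both subsets come from the same logical vector, they are guaranteed to partition the original 50 rows.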
Answer 1 (score: 1)
A one-line solution that easily handles any number of columns:
dati5[!rowSums(sapply(dati5[-3], function(x) x == max(x) | x == min(x))),]
Explanation:
function(x) x == max(x) | x == min(x) # Return TRUE if element in vector is max or min
sapply(dati5[-3], ) # Apply this to dati5 (columns x and y)
rowSums( ) # Sum this per row (FALSE = 0, TRUE = 1)
! # Logically negate this (0 becomes TRUE, anything above 0 becomes FALSE)
dati5[ ,] # Subset dati5
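The steps listed above can be followed one at a time on a small frame. The sketch below uses a hypothetical `toy` data frame and made-up intermediate names (`flags`, `hits`, `keep`) so that each stage can be printed and inspected.

```r
# Hypothetical toy frame (not the question's dati5)
toy <- data.frame(x = c(0, 5, 10, 5), y = c(7, 1, 4, 4), z = 1:4)

is_minmax <- function(x) x == max(x) | x == min(x)

flags <- sapply(toy[-3], is_minmax)  # logical matrix, one column per variable
hits  <- rowSums(flags)              # number of extreme values in each row
keep  <- !hits                       # TRUE only where the count is 0

flags
hits        # 2 1 1 0
toy[keep, ] # only row 4 remains
```

Row 1 is counted twice because it holds both the x minimum and the y maximum, which is why the negation tests for "zero hits" rather than for a specific count.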
Answer 2 (score: 0)
which_minmax <- function(x) which(x == max(x, na.rm=TRUE) | x == min(x, na.rm=TRUE))
remove_ids <- unique(unlist(sapply(dati5[, 1:2], which_minmax)))
# filtered dati5
dati5[-remove_ids, ]
# removed dati5
dati5[remove_ids, ]
This can be wrapped in a function:
remove_minmax <- function(df, cols_to_filter){
which_minmax <- function(x) which(x == max(x, na.rm=TRUE) | x == min(x, na.rm=TRUE))
remove_ids <- unique(unlist(sapply(df[, cols_to_filter], which_minmax)))
list(filtered=df[-remove_ids, ], removed=df[remove_ids, ])
}
# eg
remove_minmax(dati5, 1:2)
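A possible variant, sketched under assumptions: it builds a logical mask instead of negative row ids. Negative indexing with an empty id vector (for example, if a filtered column were entirely NA) returns zero rows instead of all rows, and a mask avoids that case. The name `remove_minmax_mask`, the `toy` frame, and the choice to keep rows whose filtered value is NA are all illustrative, not part of the original answer.

```r
# Hypothetical variant of remove_minmax(); names and data are illustrative
remove_minmax_mask <- function(df, cols_to_filter) {
  is_minmax <- function(x) {
    x == max(x, na.rm = TRUE) | x == min(x, na.rm = TRUE)
  }
  # One logical vector per column, combined row-wise with OR
  flags <- lapply(df[cols_to_filter], is_minmax)
  hit <- Reduce(`|`, flags)
  hit[is.na(hit)] <- FALSE  # assumption: rows with NA in a filtered column are kept
  list(filtered = df[!hit, , drop = FALSE],
       removed  = df[hit, , drop = FALSE])
}

toy <- data.frame(x = c(0, 5, 10, 5), y = c(7, 1, 4, 4), z = 1:4)
res <- remove_minmax_mask(toy, 1:2)

res$filtered  # rows to keep working with
res$removed   # rows set aside for later
```

Both pieces are returned in one list, so the removed rows stay available under `res$removed`.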