Remove rows with a zero value in a column when all levels of a factor show zero

Time: 2018-12-19 20:10:31

Tags: r dataframe filter

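I would like to filter out rows with a zero value when the zero occurs in every method for the same team.

For example, in the table below team One has an error of zero for every method, so rows 1, 4 and 7 need to be removed. In general, if error is zero for alpha, beta and gamma within a particular team, all of that team's rows should be removed.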

+----+-------+-------+--------+
| id | team  | error | method |
+----+-------+-------+--------+
|  1 | One   |     0 | alpha  |
|  2 | Two   |   5.7 | alpha  |
|  3 | Three |     0 | alpha  |
|  4 | One   |     0 | beta   |
|  5 | Two   |     0 | beta   |
|  6 | Three |     0 | beta   |
|  7 | One   |     0 | gamma  |
|  8 | Two   |     0 | gamma  |
|  9 | Three |   6.7 | gamma  |
+----+-------+-------+--------+

The resulting table should be:

+----+-------+-------+--------+
| id | team  | error | method |
+----+-------+-------+--------+
|  2 | Two   |   5.7 | alpha  |
|  3 | Three |     0 | alpha  |
|  5 | Two   |     0 | beta   |
|  6 | Three |     0 | beta   |
|  8 | Two   |     0 | gamma  |
|  9 | Three |   6.7 | gamma  |
+----+-------+-------+--------+

3 answers:

Answer 0 (score: 2)

Assuming the initial data frame is df, group by team and keep only the groups in which at least one error is non-zero:

library(dplyr)

# Keep only teams that have at least one non-zero error
df %>%
  group_by(team) %>%
  filter(any(error != 0))

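Answer 1 (score: 1)

After grouping by team, we can check whether the sum of the logical vector error != 0 is greater than 0, i.e. whether the team has at least one non-zero element: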

library(dplyr)
df %>% 
   group_by(team) %>% 
   filter(sum(error !=0 ) > 0)

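Or use the same logic with ==, keeping a team only when the number of zero errors is smaller than the group size:
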
df %>%
   group_by(team) %>%
   filter(sum(error == 0) < n())
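
The two conditions above select the same teams. As a rough illustration in base R, the sketch below rebuilds the question's sample data under the hypothetical name df_demo (so it runs on its own) and computes both tests per team with tapply. It is only meant to show that summing a logical vector counts its TRUE values, not to replace the dplyr code.

```r
# Sample data from the question, rebuilt so the snippet is self-contained
# (df_demo is an illustrative name, not part of the original answers)
df_demo <- data.frame(
  id = 1:9,
  team = rep(c("One", "Two", "Three"), times = 3),
  error = c(0, 5.7, 0, 0, 0, 0, 0, 0, 6.7),
  method = rep(c("alpha", "beta", "gamma"), each = 3)
)

# Summing a logical vector counts its TRUE values
nonzero_count <- tapply(df_demo$error != 0, df_demo$team, sum)
zero_count    <- tapply(df_demo$error == 0, df_demo$team, sum)
group_size    <- tapply(df_demo$error, df_demo$team, length)

keep_by_nonzero <- nonzero_count > 0        # at least one non-zero error
keep_by_zero    <- zero_count < group_size  # not every error is zero

print(keep_by_nonzero)
print(keep_by_zero)
```

Both vectors are FALSE only for team One, which is the team whose rows should be dropped.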

Data

df <- structure(list(id = 1:9, team = c("One", "Two", "Three", "One", 
 "Two", "Three", "One", "Two", "Three"), error = c(0, 5.7, 0, 
 0, 0, 0, 0, 0, 6.7), method = c("alpha", "alpha", "alpha", "beta", 
 "beta", "beta", "gamma", "gamma", "gamma")), class = "data.frame", 
 row.names = c(NA, -9L))

Answer 2 (score: 1)

A short approach using base R:

subset(df, ave(error, team)!=0)

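This removes every team whose mean error equals zero. It breaks down if error can take negative values: for example, c(-1, -2, 3) has a mean of zero even though none of its values is zero.

So a more general solution is: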

subset(df, !ave(error, team, FUN=function(x) all(x==0)))
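
To make the difference between the mean-based filter and the all-zero test concrete, here is a small sketch on made-up data. The data frame df_neg and its team labels "A" and "B" are assumptions for illustration only and do not come from the question.

```r
# Hypothetical data: team "A" has errors that cancel out, team "B" is all zero
df_neg <- data.frame(
  team  = c("A", "A", "A", "B", "B", "B"),
  error = c(-1, -2, 3, 0, 0, 0)
)

# Mean-based filter: team A has a mean of 0, so it is dropped along with team B
by_mean <- subset(df_neg, ave(error, team) != 0)

# All-zero test: only team B, where every error is zero, is dropped
by_all_zero <- subset(df_neg, !ave(error, team, FUN = function(x) all(x == 0)))

print(nrow(by_mean))
print(nrow(by_all_zero))
```

The mean-based version returns no rows here, while the all-zero test keeps the three rows of team A.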

... or, using the idea from akrun's answer:

subset(df, ave(error %in% 0, team) < 1)
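
The last variant works because ave with its default FUN = mean turns the logical vector error %in% 0 into the share of zero errors within each row's team. A minimal sketch on hypothetical data (df_prop is an assumed name, not from the original thread) shows the intermediate values:

```r
# Hypothetical data, for illustration only
df_prop <- data.frame(
  team  = c("A", "A", "A", "B", "B", "B"),
  error = c(-1, -2, 3, 0, 0, 0)
)

# Share of zero errors within each row's team (mean of a logical vector)
zero_share <- with(df_prop, ave(error %in% 0, team))
print(zero_share)

# Keep rows whose team is not made up entirely of zeros
kept <- df_prop[zero_share < 1, ]
print(kept)
```

A share of exactly 1 means every error in the team is zero, so the condition < 1 keeps every other team, including those with negative values.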