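I want to filter out rows with zero values when the zeros occur across all methods of the same team.
For example, in the table below the error for team One is zero for every method, so rows 1, 4 and 7 need to be removed.
In other words, if error is 0 for alpha, beta and gamma within a particular team, the rows for that team should be removed.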
+----+-------+-------+--------+
| id | team | error | method |
+----+-------+-------+--------+
| 1 | One | 0 | alpha |
| 2 | Two | 5.7 | alpha |
| 3 | Three | 0 | alpha |
| 4 | One | 0 | beta |
| 5 | Two | 0 | beta |
| 6 | Three | 0 | beta |
| 7 | One | 0 | gamma |
| 8 | Two | 0 | gamma |
| 9 | Three | 6.7 | gamma |
+----+-------+-------+--------+
The resulting table should be:
+----+-------+-------+--------+
| id | team | error | method |
+----+-------+-------+--------+
| 2 | Two | 5.7 | alpha |
| 3 | Three | 0 | alpha |
| 5 | Two | 0 | beta |
| 6 | Three | 0 | beta |
| 8 | Two | 0 | gamma |
| 9 | Three | 6.7 | gamma |
+----+-------+-------+--------+
Answer 0 (score: 2)
Assuming the initial data frame is df, group by team and keep only the groups in which any error is non-zero:
library(dplyr)
df %>% group_by(team) %>%
filter(any(error!=0))
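Answer 1 (score: 1)
After grouping by team, we can check whether the sum of the logical vector error != 0 is greater than 0, i.e. whether the group has at least one non-zero element: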
library(dplyr)
df %>%
group_by(team) %>%
filter(sum(error !=0 ) > 0)
Or with ==, keeping the groups in which fewer than all of the values are zero:
df %>%
group_by(team) %>%
filter(sum(error == 0) < n())
Data:
df <- structure(list(id = 1:9, team = c("One", "Two", "Three", "One",
"Two", "Three", "One", "Two", "Three"), error = c(0, 5.7, 0,
0, 0, 0, 0, 0, 6.7), method = c("alpha", "alpha", "alpha", "beta",
"beta", "beta", "gamma", "gamma", "gamma")), class = "data.frame",
row.names = c(NA, -9L))
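To see why the conditions used so far agree, the per-team test can be computed on its own. The following is a minimal sketch in base R using tapply; the variable names (any_nonzero, sum_nonzero, fewer_zeros) are illustrative and do not come from the answers above.

```r
# Sample data from the question, rebuilt so the snippet is self-contained
df <- data.frame(
  id = 1:9,
  team = rep(c("One", "Two", "Three"), times = 3),
  error = c(0, 5.7, 0, 0, 0, 0, 0, 0, 6.7),
  method = rep(c("alpha", "beta", "gamma"), each = 3)
)

# One logical value per team for each way of asking
# "does this team have at least one non-zero error?"
any_nonzero <- tapply(df$error, df$team, function(x) any(x != 0))
sum_nonzero <- tapply(df$error, df$team, function(x) sum(x != 0) > 0)
fewer_zeros <- tapply(df$error, df$team, function(x) sum(x == 0) < length(x))

# Side-by-side view: the three columns match, and only team "One" is FALSE
cbind(any_nonzero, sum_nonzero, fewer_zeros)
```

Whichever form is used, a team is kept exactly when its value here is TRUE, so the choice between any() and the sum() comparisons is mostly one of readability.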
Answer 2 (score: 1)
A short approach using base R:
subset(df, ave(error, team)!=0)
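This filters out every team whose mean error equals zero, which gives the wrong result if error can take negative values (e.g. c(-1, -2, 3), whose mean is 0 even though no element is zero).
So a more general solution is: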
subset(df, !ave(error, team, FUN=function(x) all(x==0)))
...or, using the idea from akrun's answer:
subset(df, ave(error %in% 0, team) < 1)
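The difference between the mean-based filter and the more general one only shows up when errors can cancel out. The sketch below uses a small hypothetical data frame (df2, with made-up teams "A" and "B") to illustrate it; it is not part of the original data.

```r
# Hypothetical data: team "A" has errors that cancel out, team "B" is all zero
df2 <- data.frame(
  team  = c("A", "A", "A", "B", "B", "B"),
  error = c(-1, -2, 3, 0, 0, 0)
)

# Mean-based filter: also drops team "A", because mean(c(-1, -2, 3)) is 0
mean_based <- subset(df2, ave(error, team) != 0)

# all(x == 0) filter: flags only the teams whose errors are all zero
all_zero <- as.logical(ave(df2$error, df2$team, FUN = function(x) all(x == 0)))
general <- df2[!all_zero, ]

nrow(mean_based)  # 0 rows: both teams removed
nrow(general)     # 3 rows: team "A" kept
```

If error is known to be non-negative, as in the question's data, the shorter mean-based version gives the same result.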