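When I filter a data frame using a condition that is itself built from other filters, it does not seem to work. However, if I store the condition as a variable (`f` in the example), the filtering works fine. Can someone explain why this happens, and how to make something like Example 2 work? I would rather not store the filter condition as a variable.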
library(dplyr)
# Dummy data set
df <- data.frame(Country = factor(c("Argentina", "Brazil", "Brazil", "Brazil")),
Type = factor(c("A", "A", "B", "C")))
# Only returns Brazil. No problem here.
f <- df %>%
group_by(Country) %>%
summarise(nTypes = n_distinct(Type)) %>%
filter(nTypes==3) %>%
select(Country) %>%
droplevels() %>%
unlist()
# > f
# Country
# Brazil
# Levels: Brazil
# Example 1 - Only returns rows of df where Country=="Brazil". No problem here.
df %>% filter(
Country %in% (f)
)
# Country Type
# 1 Brazil A
# 2 Brazil B
# 3 Brazil C
# Example 2 - Filter is equivalent to `f` but returns all rows of df, not just Brazil. No idea why!
df %>% filter(
Country %in% (df %>%
group_by(Country) %>%
summarise(nTypes = n_distinct(Type)) %>%
filter(nTypes==3) %>%
select(Country) %>%
droplevels() %>%
unlist()
)
)
# Country Type
# 1 Argentina A
# 2 Brazil A
# 3 Brazil B
# 4 Brazil C
Answer 0 (score: 1)
While I am not sure why you get the unexpected result, based on this answer: Using filter inside filter in dplyr gives unexpected results, one way to get the desired result is to use inner_join after the filter:
df %>%
group_by(Country) %>%
summarise(nTypes = n_distinct(Type)) %>%
filter(nTypes==3) %>%
select(Country) %>% inner_join(.,df)
Output:
Joining, by = "Country"
# A tibble: 3 x 2
Country Type
<fctr> <fctr>
1 Brazil A
2 Brazil B
3 Brazil C
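If the goal is simply to avoid both the intermediate variable and the nested pipeline, two other approaches may be worth trying. The sketch below is not from the original answer: it assumes the same `df` as in the question, and the names `by_group` and `by_semi` are made up for illustration. The first variant filters within each group directly; the second uses `semi_join`, which keeps only the matching rows of `df` without adding columns.

```r
library(dplyr)

# Same dummy data as in the question
df <- data.frame(Country = factor(c("Argentina", "Brazil", "Brazil", "Brazil")),
                 Type = factor(c("A", "A", "B", "C")))

# Variant 1 (illustrative name): grouped filter.
# n_distinct(Type) is evaluated once per Country group,
# so only groups with exactly 3 distinct types are kept.
by_group <- df %>%
  group_by(Country) %>%
  filter(n_distinct(Type) == 3) %>%
  ungroup()

# Variant 2 (illustrative name): semi_join against the summarised table.
# Rows of df are kept when their Country appears in the filtered summary.
by_semi <- df %>%
  semi_join(
    df %>%
      group_by(Country) %>%
      summarise(nTypes = n_distinct(Type)) %>%
      filter(nTypes == 3),
    by = "Country"
  )

print(by_group)
print(by_semi)
```

Both variants should return only the Brazil rows. Passing `by = "Country"` explicitly also avoids the "Joining, by" message shown in the output above.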