R dplyr: filter only works when the filter condition is stored in a variable

Time: 2017-11-16 06:55:27

Tags: r filter dplyr

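When I filter a data frame with a condition that is itself built from other filters, it does not seem to work. However, if I store the condition in a variable (f in the example), the filtering works fine. Can someone explain why this happens, and how to make something like Example 2 work? I would rather not store the filter condition in a variable.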

library(dplyr)

# Dummy data set
df <- data.frame(Country = factor(c("Argentina", "Brazil", "Brazil", "Brazil")), 
                 Type = factor(c("A", "A", "B", "C")))

# Only returns Brazil. No problem here.
f <- df %>% 
  group_by(Country) %>% 
  summarise(nTypes = n_distinct(Type)) %>% 
  filter(nTypes==3) %>% 
  select(Country) %>% 
  droplevels() %>% 
  unlist()
# > f
#   Country 
# Brazil 
# Levels: Brazil


# Example 1 - Only returns rows of df where Country=="Brazil". No problem here.
df %>% filter(
  Country %in% (f
                )
  )
#   Country Type
# 1  Brazil    A
# 2  Brazil    B
# 3  Brazil    C


# Example 2 - Filter is equivalent to `f` but returns all rows of df, not just Brazil. No idea why!
df %>% filter(
  Country %in% (df %>% 
                  group_by(Country) %>% 
                  summarise(nTypes = n_distinct(Type)) %>% 
                  filter(nTypes==3) %>% 
                  select(Country) %>% 
                  droplevels() %>% 
                  unlist()
                )
  )
#     Country Type
# 1 Argentina    A
# 2    Brazil    A
# 3    Brazil    B
# 4    Brazil    C

1 answer:

Answer 0 (score: 1)

Although I am not sure why you get the unexpected result, based on the answers to "Using filter inside filter in dplyr gives unexpected results", one way to get the desired result is to use inner_join after the filter:

df %>% 
  group_by(Country) %>% 
  summarise(nTypes = n_distinct(Type)) %>% 
  filter(nTypes==3) %>% 
  select(Country) %>% inner_join(.,df)

Output:

Joining, by = "Country"
# A tibble: 3 x 2
  Country   Type
   <fctr> <fctr>
1  Brazil      A
2  Brazil      B
3  Brazil      C
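
Two other ways to avoid storing the condition in a variable are sketched below. They are not part of the original answer, and the names keep, via_semi_join and via_grouped_filter are made up for illustration. The first uses semi_join, which keeps the rows of df that have a match in the summarised table without adding columns. The second skips the summary table and filters within each group. Both assume a dplyr version that provides semi_join and n_distinct.

```r
library(dplyr)

# Same dummy data set as in the question
df <- data.frame(Country = factor(c("Argentina", "Brazil", "Brazil", "Brazil")),
                 Type = factor(c("A", "A", "B", "C")))

# Option 1: build the lookup table inline and keep only the matching rows of df.
# semi_join returns the columns of df only; `keep` is a hypothetical name.
keep <- df %>%
  group_by(Country) %>%
  summarise(nTypes = n_distinct(Type)) %>%
  filter(nTypes == 3)

via_semi_join <- semi_join(df, keep, by = "Country")

# Option 2: filter inside each group, with no intermediate table at all.
via_grouped_filter <- df %>%
  group_by(Country) %>%
  filter(n_distinct(Type) == 3) %>%
  ungroup()
```

Either result should contain only the Brazil rows. Note that the Country factor keeps its unused Argentina level unless droplevels() is applied afterwards.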