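This seems simple, but somehow I can't figure out how to solve it. What is the best way to detect whether two specific string values both occur in another column within a group?
Example df: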
library(tidyverse)
tribble(
~city, ~var,
"A", "PVDA",
"A", "GL",
"A", "GMBL",
"B", "GL",
"B", "VVD",
"C", "CDA",
"C", "VVD"
)
What I'd like to do is something like this:
join_anp_vgn_sf %>%
group_by(city) %>%
filter(grepl("^PVDA$&^GL$", var))
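But this doesn't work, because grepl tests each value of var on its own, and no single value can equal both "PVDA" and "GL".
Desired output: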
city var
<chr> <chr>
1 A PVDA
2 A GL
3 A GMBL
Answer 0 (score: 3)
Using dplyr:
df <- tribble(
~city, ~var,
"A", "PVDA",
"A", "GL",
"B", "GL",
"B", "VVD",
"C", "CDA",
"C", "VVD"
)
df %>%
group_by(city) %>%
filter(all(c("PVDA","GL") %in% var))
# A tibble: 2 x 2
# Groups: city [1]
# city var
# <chr> <chr>
# 1 A PVDA
# 2 A GL
Edit
Using the updated example:
df <- tribble(
~city, ~var,
"A", "PVDA",
"A", "GL",
"A", "GMBL",
"B", "GL",
"B", "VVD",
"C", "CDA",
"C", "VVD"
)
df %>%
group_by(city) %>%
filter(all(c("PVDA","GL") %in% var))
# A tibble: 3 x 2
# Groups: city [1]
# city var
# <chr> <chr>
# 1 A PVDA
# 2 A GL
# 3 A GMBL
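To see why the condition works, it can help to evaluate it once per group outside of filter. The following is a minimal sketch, assuming the updated example data; has_both and group_flags are names introduced here for illustration only.

```r
library(dplyr)

df <- tribble(
  ~city, ~var,
  "A", "PVDA",
  "A", "GL",
  "A", "GMBL",
  "B", "GL",
  "B", "VVD",
  "C", "CDA",
  "C", "VVD"
)

# One row per city: TRUE only when both target values occur in that city's var.
# `has_both` is a hypothetical column name used for illustration.
group_flags <- df %>%
  group_by(city) %>%
  summarise(has_both = all(c("PVDA", "GL") %in% var))

group_flags
```

Inside filter, the same single TRUE or FALSE per group is recycled across all rows of the group, which is why every row of a qualifying city is kept, not just the matching ones.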
Answer 1 (score: 1)
Use the grepl function to find the cities that have both a PVDA and a GL value, then select those rows from the original tibble.
PVDA<-as.character(unlist(df[grepl("^PVDA", df$var),"city"]))
GL<-as.character(unlist(df[grepl("^GL", df$var),"city"]))
df[df$city %in% PVDA[PVDA %in% GL],]
# A tibble: 2 x 2
city var
<chr> <chr>
1 A PVDA
2 A GL
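The same idea can be sketched in base R with exact matching and intersect, which also covers the case where more than one city qualifies. This assumes the df from the first answer, rebuilt here as a plain data frame; cities_pvda, cities_gl, both and result are illustrative names.

```r
df <- data.frame(
  city = c("A", "A", "B", "B", "C", "C"),
  var  = c("PVDA", "GL", "GL", "VVD", "CDA", "VVD"),
  stringsAsFactors = FALSE
)

# Cities that contain each target value (exact match, not a prefix match).
cities_pvda <- unique(df$city[df$var == "PVDA"])
cities_gl   <- unique(df$city[df$var == "GL"])

# Cities that contain both values.
both <- intersect(cities_pvda, cities_gl)

# Keep every row belonging to those cities.
result <- df[df$city %in% both, ]
result
```

Subsetting with %in% rather than == avoids recycling problems when both holds several cities.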
Answer 2 (score: 1)
If you like, you can still use grepl, which lets you use partial string matching:
Dplyr:
df %>%
group_by(city) %>%
filter(sum(grepl("PVDA|GL", unique(var))) >= 2)
# A tibble: 2 x 2
# Groups: city [1]
# city var
# <chr> <chr>
#1 A PVDA
#2 A GL
Base R:
df[ave(df$var, df$city, FUN = function(x) sum(grepl("PVDA|GL", unique(x))) >= 2) %>% as.logical, ]
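One caveat with counting matches of "PVDA|GL": a group with two distinct values that both match the same alternative also reaches a count of 2. A possible variant, sketched below, checks each pattern separately. The data in df2 is hypothetical (the value "GLX" and city "D" are made up for illustration), as are the names counted and result2.

```r
library(dplyr)

# Hypothetical data: city "D" has two values containing "GL" but none containing "PVDA".
df2 <- tribble(
  ~city, ~var,
  "A", "PVDA",
  "A", "GL",
  "D", "GL",
  "D", "GLX"
)

# Counting approach: keeps city "D" as well, because "GL" and "GLX" both match.
counted <- df2 %>%
  group_by(city) %>%
  filter(sum(grepl("PVDA|GL", unique(var))) >= 2) %>%
  ungroup()

# Separate checks: a city is kept only if some value matches each pattern.
result2 <- df2 %>%
  group_by(city) %>%
  filter(any(grepl("PVDA", var)), any(grepl("GL", var))) %>%
  ungroup()

result2
```

Whether this stricter check is wanted depends on the data; with values that cannot overlap, the counting approach gives the same rows.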