Question

In the following data frame, I want to keep only the groups that contain all of the persons "a", "b", and "c":

df <- structure(list(group = c(1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 4),
    person = structure(c(1L, 2L, 1L, 3L, 1L, 2L, 3L, 1L, 1L, 2L, 3L, 4L),
        .Label = c("a", "b", "c", "e"), class = "factor")),
    .Names = c("group", "person"),
    row.names = c(NA, -12L), class = "data.frame")

Answer 1

We can use data.table. Convert the 'data.frame' to a 'data.table' (setDT(df)). Grouped by 'group', we build a logical index by checking whether all of the elements 'a', 'b', 'c' are %in% 'person', and use it to subset the data.table (.SD).

library(data.table)
setDT(df)[, .SD[all(c('a', 'b', 'c') %in% person)], group]
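To see what the grouped condition evaluates to, here is a minimal base R sketch (not part of the original answer) that computes the same per-group check with tapply. The names wanted and has_all are illustrative only.

```r
# Same data as in the question, rebuilt compactly
df <- data.frame(
  group  = c(1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 4),
  person = factor(c("a", "b", "a", "c", "a", "b", "c", "a", "a", "b", "c", "e"))
)

wanted <- c("a", "b", "c")

# One logical value per group: does the group contain every wanted person?
has_all <- tapply(df$person, df$group, function(p) all(wanted %in% p))
print(has_all)

# Groups for which the condition holds (2 and 4 for this data)
print(names(has_all)[has_all])
```

Because the condition is a single TRUE or FALSE per group, the grouped filter keeps or drops each group as a whole.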

Or use dplyr, applying the same approach after grouping by 'group'

library(dplyr)
df %>%
   group_by(group) %>%
   filter(all(c('a', 'b', 'c') %in% person))

Or with base R

v1 <- rowSums(table(df)[, c('a', 'b', 'c')]>0)==3
subset(df, group %in% names(v1)[v1])
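The base R version is compact, so the intermediate steps may be worth spelling out. The sketch below splits it into named pieces; tab, present, and result are illustrative names that do not appear in the original answer.

```r
# Same data as in the question, rebuilt compactly
df <- data.frame(
  group  = c(1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 4),
  person = factor(c("a", "b", "a", "c", "a", "b", "c", "a", "a", "b", "c", "e"))
)

# Contingency table: one row per group, one column per person, cells are counts
tab <- table(df)

# TRUE where a person occurs at least once in a group
present <- tab[, c("a", "b", "c")] > 0

# A group qualifies when all three persons are present
v1 <- rowSums(present) == 3

# Keep the rows whose group qualifies
result <- subset(df, group %in% names(v1)[v1])
print(result)
```

Comparing against > 0 rather than the raw counts matters here: group 1 contains 'a' twice, so summing the counts alone would not distinguish it from a group that has each person once.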

Update

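If we want to return only group 2, the one group whose persons are exactly 'a', 'b', and 'c' (group 4 also contains 'e'), we can add a second condition with dplyr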

df %>% 
    group_by(group) %>%
    filter(all(c('a', 'b', 'c') %in% person), all(person %in% c('a', 'b', 'c')))

Or with n_distinct

df %>%
   group_by(group) %>%
   filter(all(c('a', 'b', 'c') %in% person), n_distinct(person)==3)

Or with data.table

setDT(df)[, .SD[all(c('a', 'b', 'c') %in% person) & uniqueN(person)==3], group]
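As a cross-check on the stricter condition, the sketch below uses base R's setequal, a different technique from the ones in the answer, to test whether each group's set of persons is exactly 'a', 'b', 'c'. The names wanted and exact are illustrative.

```r
# Same data as in the question, rebuilt compactly
df <- data.frame(
  group  = c(1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 4),
  person = factor(c("a", "b", "a", "c", "a", "b", "c", "a", "a", "b", "c", "e"))
)

wanted <- c("a", "b", "c")

# TRUE only when the group's distinct persons match the wanted set exactly
exact <- tapply(df$person, df$group,
                function(p) setequal(as.character(p), wanted))
print(exact)

# Rows of the groups that match exactly (only group 2 for this data)
print(df[df$group %in% names(exact)[exact], ])
```

setequal ignores duplicates and order, so it combines the "contains all" and "contains nothing else" checks in one call.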
