Question

Given the data.table,

library(data.table)
dt <- data.table(Year=c(rep(2014, 8), 2015, 2014, 2014), no=c(111,111,111,222,222,333,333,444,555,666,666), type=c('a','b','c','a','a','a','f','a', 'a', 'c','f'))

which returns,

    Year  no type
 1: 2014 111    a
 2: 2014 111    b
 3: 2014 111    c
 4: 2014 222    a
 5: 2014 222    a
 6: 2014 333    a
 7: 2014 333    f
 8: 2014 444    a
 9: 2015 555    a
10: 2014 666    c
11: 2014 666    f

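I want to filter out any no that does not contain both 'a' and at least one other type ('b', 'c', etc.). This means ids 222, 444 and 666 will be filtered out. Note that no 555 is also filtered out, because its Year is 2015.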

My expected return is

   Year  no type
1: 2014 111    a
2: 2014 111    b
3: 2014 111    c
4: 2014 333    a
5: 2014 333    f

Then we apply unique to no and end up with 111 and 333 as our final result.

I have tried an approach of my own, but I don't think the code is efficient. Can you give me a suggestion?

Answer 1

How about this:

dt[Year == 2014, if("a" %in% type & uniqueN(type) > 1) .SD, by = no]
#    no Year type
#1: 111 2014    a
#2: 111 2014    b
#3: 111 2014    c
#4: 333 2014    a
#5: 333 2014    f

Or, since you are only interested in the unique values of no:

dt[Year == 2014, "a" %in% type & uniqueN(type) > 1, by = no][(V1), no]
#[1] 111 333

If your type column may contain NAs and you don't want to count them as additional values, you can modify it to:

dt[Year == 2014, "a" %in% type & uniqueN(na.omit(type)) > 1, by = no][(V1), no]
#[1] 111 333
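
To see why the na.omit() variant matters, here is a minimal sketch on made-up data (dt_na and the ids 777 and 888 are hypothetical, not from the question). Group 777 has only 'a' and a missing value, so it should arguably not count as having a second type:

```r
library(data.table)

# Hypothetical data: 777 has 'a' plus NA, 888 has 'a' plus 'b'
dt_na <- data.table(
  Year = 2014,
  no   = c(777, 777, 888, 888),
  type = c("a", NA, "a", "b")
)

# NA is counted as a distinct value here, so 777 passes the filter
with_na <- dt_na[Year == 2014, "a" %in% type & uniqueN(type) > 1, by = no][(V1), no]

# Only non-missing types are counted here, so 777 is dropped
without_na <- dt_na[Year == 2014, "a" %in% type & uniqueN(na.omit(type)) > 1, by = no][(V1), no]

print(with_na)     # 777 888
print(without_na)  # 888
```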

Answer 2

We can also use any:

res <- dt[Year==2014, if(any(type=="a") & any(type!="a")) .SD, no]
res
#    no Year type
#1: 111 2014    a
#2: 111 2014    b
#3: 111 2014    c
#4: 333 2014    a
#5: 333 2014    f

unique(res$no)
#[1] 111 333
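
If the same filter is needed for other years or other target types, it could be wrapped in a small function. This is only a sketch: mixed_ids and its arguments are names made up for illustration, and it uses && because both sides of the condition are single logical values.

```r
library(data.table)

dt <- data.table(
  Year = c(rep(2014, 8), 2015, 2014, 2014),
  no   = c(111, 111, 111, 222, 222, 333, 333, 444, 555, 666, 666),
  type = c("a", "b", "c", "a", "a", "a", "f", "a", "a", "c", "f")
)

# Hypothetical helper: ids within `year` whose types include `target`
# and at least one value different from `target`
mixed_ids <- function(d, year, target) {
  d[Year == year,
    any(type == target) && any(type != target),
    by = no][(V1), no]
}

ids <- mixed_ids(dt, 2014, "a")
print(ids)  # 111 333
```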

The same approach can be used with dplyr:

library(dplyr)
dt %>%
   group_by(no) %>% 
   filter(any(type=="a") & any(type!="a") & Year==2014)
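
To go from the filtered rows to just the unique no values in dplyr, one possible sketch is shown below. It assumes a plain data frame df holding the same data as dt, and it filters on Year before grouping, so that the any() checks only see 2014 rows.

```r
library(dplyr)

# Same data as in the question, as a plain data frame (df is an assumed name)
df <- data.frame(
  Year = c(rep(2014, 8), 2015, 2014, 2014),
  no   = c(111, 111, 111, 222, 222, 333, 333, 444, 555, 666, 666),
  type = c("a", "b", "c", "a", "a", "a", "f", "a", "a", "c", "f")
)

ids <- df %>%
  filter(Year == 2014) %>%                            # restrict to 2014 first
  group_by(no) %>%
  filter(any(type == "a") & any(type != "a")) %>%     # keep groups with 'a' and another type
  ungroup() %>%
  distinct(no) %>%                                    # one row per remaining id
  pull(no)

print(ids)  # 111 333
```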

How to select rows with conditions based on other columns
