How do I conditionally delete rows from a data table?
For example, I have:
Apple, 2001
Apple, 2002
Apple, 2003
Apple, 2004
Banana, 2001
Banana, 2002
Banana, 2003
Candy, 2001
Candy, 2002
Candy, 2003
Candy, 2004
Dog, 2001
Dog, 2002
Dog, 2004
Water, 2002
Water, 2003
Water, 2004
I want to keep only the groups that have a row for every year from 2001 to 2004, i.e.:
Apple, 2001
Apple, 2002
Apple, 2003
Apple, 2004
Candy, 2001
Candy, 2002
Candy, 2003
Candy, 2004
Answer 0 (score: 3)
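Using data.table, check for each group of 'Col1' whether all of 2001:2004 are present (%in%) in the 'year' column, and if so, return that group's subset of the data.table (.SD):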
library(data.table)
setDT(df1)[, if(all(2001:2004 %in% year)) .SD, by = Col1]
# Col1 year
#1: Apple 2001
#2: Apple 2002
#3: Apple 2003
#4: Apple 2004
#5: Candy 2001
#6: Candy 2002
#7: Candy 2003
#8: Candy 2004
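To see why an if without an else drops the incomplete groups, here is a minimal sketch. It assumes the data.table package is installed; the names dt, flags, and kept are illustrative and do not come from the answer. It first computes one flag per group and then applies the same filter:

```r
library(data.table)

# Same rows as in the question, rebuilt compactly (illustrative helper data)
dt <- data.table(
  Col1 = rep(c("Apple", "Banana", "Candy", "Dog", "Water"),
             times = c(4, 3, 4, 3, 3)),
  year = c(2001:2004, 2001:2003, 2001:2004, 2001L, 2002L, 2004L, 2002:2004)
)

# One logical value per group: does the group contain every year 2001..2004?
# Apple and Candy are TRUE; Banana, Dog, and Water are FALSE.
flags <- dt[, .(complete = all(2001:2004 %in% year)), by = Col1]
print(flags)

# When the condition is FALSE, `if` without `else` evaluates to NULL,
# and a group whose j expression is NULL contributes no rows to the result.
kept <- dt[, if (all(2001:2004 %in% year)) .SD, by = Col1]
print(kept)
```

The flags table is only a debugging aid; the one-liner in the answer does the filtering in a single step.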
Data:
df1 <- structure(list(Col1 = c("Apple", "Apple", "Apple", "Apple", "Banana",
"Banana", "Banana", "Candy", "Candy", "Candy", "Candy", "Dog",
"Dog", "Dog", "Water", "Water", "Water"), year = c(2001L, 2002L,
2003L, 2004L, 2001L, 2002L, 2003L, 2001L, 2002L, 2003L, 2004L,
2001L, 2002L, 2004L, 2002L, 2003L, 2004L)), .Names = c("Col1",
"year"), class = "data.frame", row.names = c(NA, -17L))
Answer 1 (score: 2)
Using base R, we can use ave to get the desired result:
df1[ave(df1$year, df1$Col1, FUN = function(x) all(2001:2004 %in% x)) == 1, ]
# Col1 year
#1 Apple 2001
#2 Apple 2002
#3 Apple 2003
#4 Apple 2004
#8 Candy 2001
#9 Candy 2002
#10 Candy 2003
#11 Candy 2004
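The ave call is easier to follow if the intermediate vector is inspected first. The sketch below uses only base R; the names flag and result are illustrative, and the data frame is rebuilt compactly from the question's rows:

```r
# Same rows as in the question (illustrative reconstruction)
df1 <- data.frame(
  Col1 = rep(c("Apple", "Banana", "Candy", "Dog", "Water"),
             times = c(4, 3, 4, 3, 3)),
  year = c(2001:2004, 2001:2003, 2001:2004, 2001L, 2002L, 2004L, 2002:2004),
  stringsAsFactors = FALSE
)

# ave() applies FUN to the years of each Col1 group and writes the result
# back in the original row positions. The single TRUE/FALSE per group is
# recycled over the group's rows and stored as 1/0, because the vector
# being filled (year) is an integer vector.
flag <- ave(df1$year, df1$Col1, FUN = function(x) all(2001:2004 %in% x))
print(flag)

# Keep the rows whose group passed the check
result <- df1[flag == 1, ]
print(result)
```

Because the flag has one entry per row, it can be used directly as a row index, which is why the answer compares it with 1 inside the brackets.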
Answer 2 (score: 2)
A dplyr approach:
library(dplyr) # or library(tidyverse)
df1 %>%
group_by(Col1) %>%
filter(all(2001:2004 %in% year))
This works because, within each group, . %>% filter(TRUE) returns all rows, whereas . %>% filter(FALSE) drops all rows.
Output:
Source: local data frame [8 x 2]
Groups: Col1 [2]
Col1 year
<chr> <int>
1 Apple 2001
2 Apple 2002
3 Apple 2003
4 Apple 2004
5 Candy 2001
6 Candy 2002
7 Candy 2003
8 Candy 2004
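The recycling behaviour of filter that the answer relies on can be checked directly. This is a minimal sketch assuming the dplyr package is installed; all_rows, no_rows, and per_group are illustrative names, and the data frame is rebuilt from the question's rows:

```r
library(dplyr)

# Same rows as in the question (illustrative reconstruction)
df1 <- data.frame(
  Col1 = rep(c("Apple", "Banana", "Candy", "Dog", "Water"),
             times = c(4, 3, 4, 3, 3)),
  year = c(2001:2004, 2001:2003, 2001:2004, 2001L, 2002L, 2004L, 2002:2004),
  stringsAsFactors = FALSE
)

# A length-one condition is recycled to every row it applies to
all_rows <- df1 %>% filter(TRUE)   # keeps all 17 rows
no_rows  <- df1 %>% filter(FALSE)  # keeps no rows

# The per-group condition used in the answer, shown as one value per group
per_group <- df1 %>%
  group_by(Col1) %>%
  summarise(complete = all(2001:2004 %in% year))
print(per_group)
```

Inside a grouped filter, the same single TRUE or FALSE is evaluated once per group, so a group is kept or dropped as a whole.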