I have a data frame like this:
ID <- c("ID001","ID001","ID001","ID001","ID001","ID001","ID001",
"ID002","ID002","ID002","ID002","ID002")
Type <- c("A","A","A","A","A","A","A",
"B","B","B","B","B")
Measurement <- c("Length","Summary","Breadth","Length","Summary","Breadth","Summary",
"Length","Summary","Breadth","Breadth","Summary")
PassFail <- c("PASS","PASS","PASS","FAIL_PTS","FAIL","FAIL_AVG_HI","FAIL",
"PASS","FAIL_PTS","FAIL","FAIL_AVG_LOW","FAIL")
ToolID <- c("SWP","SWP","SWP","ISP","ISP","IKS","IKS",
"PSX","PSX","PSX","PZY","PZY")
df <- data.frame(ID,Type,Measurement,PassFail,ToolID)
df
ID Type Measurement PassFail ToolID
ID001 A Length PASS SWP
ID001 A Summary PASS SWP
ID001 A Breadth PASS SWP
ID001 A Length FAIL_PTS ISP
ID001 A Summary FAIL ISP
ID001 A Breadth FAIL_AVG_HI IKS
ID001 A Summary FAIL IKS
ID002 B Length PASS PSX
ID002 B Summary FAIL_PTS PSX
ID002 B Breadth FAIL PSX
ID002 B Breadth FAIL_AVG_LOW PZY
ID002 B Summary FAIL PZY
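I am trying to subset this data frame with the following condition: when PassFail is 'FAIL_AVG_HI' or 'FAIL_AVG_LOW', I want to remove every row in the same group, where a group is defined by (ID, Type, ToolID).
My desired output looks like this: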
ID Type Measurement PassFail ToolID
ID001 A Length PASS SWP
ID001 A Summary PASS SWP
ID001 A Breadth PASS SWP
ID001 A Length FAIL_PTS ISP
ID001 A Summary FAIL ISP
ID002 B Length PASS PSX
ID002 B Summary FAIL_PTS PSX
ID002 B Breadth FAIL PSX
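I am struggling with the grouping needed to remove the rows. I can remove the rows that have the PassFail values above, but how do I group them and remove all rows that belong to the same group?
This is what I do to remove the individual rows: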
df <- subset(df, PassFail != 'FAIL_AVG_HI' & PassFail != 'FAIL_AVG_LOW')
Answer 0 (score: 2)
You can use group_by %>% filter to keep only the groups in which no row has one of those PassFail values:
library(dplyr)
df %>%
group_by(ID, Type, ToolID) %>%
filter(!any(PassFail %in% c('FAIL_AVG_HI', 'FAIL_AVG_LOW')))
#Source: local data frame [8 x 5]
#Groups: ID, Type, ToolID [3]
# ID Type Measurement PassFail ToolID
# <fctr> <fctr> <fctr> <fctr> <fctr>
#1 ID001 A Length PASS SWP
#2 ID001 A Summary PASS SWP
#3 ID001 A Breadth PASS SWP
#4 ID001 A Length FAIL_PTS ISP
#5 ID001 A Summary FAIL ISP
#6 ID002 B Length PASS PSX
#7 ID002 B Summary FAIL_PTS PSX
#8 ID002 B Breadth FAIL PSX
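If dplyr is not available, the same group-level filter can be sketched in base R. This is only an illustration, not part of the original answer: the names `bad`, `group_has_bad` and `result` are made up here, and the data frame is rebuilt from the question so that the snippet runs on its own.

```r
# Rebuild the example data from the question
ID <- c("ID001","ID001","ID001","ID001","ID001","ID001","ID001",
        "ID002","ID002","ID002","ID002","ID002")
Type <- c("A","A","A","A","A","A","A",
          "B","B","B","B","B")
Measurement <- c("Length","Summary","Breadth","Length","Summary","Breadth","Summary",
                 "Length","Summary","Breadth","Breadth","Summary")
PassFail <- c("PASS","PASS","PASS","FAIL_PTS","FAIL","FAIL_AVG_HI","FAIL",
              "PASS","FAIL_PTS","FAIL","FAIL_AVG_LOW","FAIL")
ToolID <- c("SWP","SWP","SWP","ISP","ISP","IKS","IKS",
            "PSX","PSX","PSX","PZY","PZY")
df <- data.frame(ID, Type, Measurement, PassFail, ToolID)

# Flag the individual rows that carry one of the unwanted values
bad <- df$PassFail %in% c("FAIL_AVG_HI", "FAIL_AVG_LOW")

# Spread the flag to every row of the same (ID, Type, ToolID) group:
# ave() applies any() within each group and returns one value per row
group_has_bad <- ave(bad, df$ID, df$Type, df$ToolID, FUN = any)

# Keep only the rows whose group contains no flagged row
result <- df[!group_has_bad, ]
result
```

The point of `ave()` here is that it returns a vector as long as the input, so the group-level decision can be used directly as a row index. Note that `result` keeps the original row names of `df`.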
Answer 1 (score: 1)
We can use data.table:
library(data.table)
setDT(df)[, if(!any(PassFail %in% c('FAIL_AVG_HI', 'FAIL_AVG_LOW')))
.SD, .(ID, Type, ToolID)]
# ID Type ToolID Measurement PassFail
#1: ID001 A SWP Length PASS
#2: ID001 A SWP Summary PASS
#3: ID001 A SWP Breadth PASS
#4: ID001 A ISP Length FAIL_PTS
#5: ID001 A ISP Summary FAIL
#6: ID002 B PSX Length PASS
#7: ID002 B PSX Summary FAIL_PTS
#8: ID002 B PSX Breadth FAIL
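In the output above the grouping columns are moved to the front, and `setDT(df)` converts `df` in place. If either matters, one possible variant is a data.table anti-join against the offending groups. This is a sketch under assumptions rather than the answerer's method: `dt`, `bad_groups` and `result` are names invented for illustration, and the data is rebuilt from the question so the snippet is self-contained.

```r
library(data.table)

# Rebuild the example data from the question
ID <- c("ID001","ID001","ID001","ID001","ID001","ID001","ID001",
        "ID002","ID002","ID002","ID002","ID002")
Type <- c("A","A","A","A","A","A","A",
          "B","B","B","B","B")
Measurement <- c("Length","Summary","Breadth","Length","Summary","Breadth","Summary",
                 "Length","Summary","Breadth","Breadth","Summary")
PassFail <- c("PASS","PASS","PASS","FAIL_PTS","FAIL","FAIL_AVG_HI","FAIL",
              "PASS","FAIL_PTS","FAIL","FAIL_AVG_LOW","FAIL")
ToolID <- c("SWP","SWP","SWP","ISP","ISP","IKS","IKS",
            "PSX","PSX","PSX","PZY","PZY")
df <- data.frame(ID, Type, Measurement, PassFail, ToolID)

# as.data.table() makes a copy, so df itself stays a plain data.frame
dt <- as.data.table(df)

# The distinct (ID, Type, ToolID) groups that contain an unwanted value
bad_groups <- unique(
  dt[PassFail %in% c("FAIL_AVG_HI", "FAIL_AVG_LOW"), .(ID, Type, ToolID)]
)

# Anti-join: keep the rows of dt that match none of the bad groups
result <- dt[!bad_groups, on = .(ID, Type, ToolID)]
result
```

Because the anti-join only drops rows, `result` keeps the columns in their original order.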