I have a sample data frame that I'm working with:
ID <- c("ID001","ID001","ID003","ID003","ID003","ID006","ID007","ID007","ID009","ID010",
"ID021","ID021","ID023","ID023","ID023","ID026","ID027","ID207","ID023")
Type <- c("Length","Length","LengthTest","LengthTest","Length","LengthTest","LengthTest","Length","LengthTest","LengthTest",
"LengthTest","Length","LengthTest","LengthTest","LengthTest","Length","LengthTest","LengthTest","LengthTest")
PassFail <- c("PASS","PASS","PASS","PASS","FAIL","FAIL_AVG","FAIL#PTS","PASS","FAIL","PASS",
"FAIL_SIG","PASS","PASS","FAIL#NODATA","PASS","PASS","FAIL","FAIL#PTS","PASS")
Slot <- c(1.0,1.0,1.1,1.2,2.0,2.1,2.2,1.0,1.1,1.2,
1.3,2.0,2.1,2.2,2.3,3.0,3.1,3.2,3.3)
Num <- c(1111,1112,1112,1112,1113,1113,1113,1114,1114,1114,
1114,1115,1115,1115,1115,1115,1115,1115,1115)
df <- data.frame(ID,Type,PassFail,Slot,Num)
df
ID     Type        PassFail     Slot  Num
ID001  Length      PASS         1.0   1111
ID001  Length      PASS         1.0   1112
ID003  LengthTest  PASS         1.1   1112
ID003  LengthTest  PASS         1.2   1112
ID003  Length      FAIL         2.0   1113
ID006  LengthTest  FAIL_AVG     2.1   1113
ID007  LengthTest  FAIL#PTS     2.2   1113
ID007  Length      PASS         1.0   1114
ID009  LengthTest  FAIL         1.1   1114
ID010  LengthTest  PASS         1.2   1114
ID021  LengthTest  FAIL_SIG     1.3   1114
ID021  Length      PASS         2.0   1115
ID023  LengthTest  PASS         2.1   1115
ID023  LengthTest  FAIL#NODATA  2.2   1115
ID023  LengthTest  PASS         2.3   1115
ID026  Length      PASS         3.0   1115
ID027  LengthTest  FAIL         3.1   1115
ID207  LengthTest  FAIL#PTS     3.2   1115
ID023  LengthTest  PASS         3.3   1115
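I'm trying to reduce this data frame to only the rows that meet certain conditions. I want to summarize the Slot column, grouped by the Num column.
The Slot column usually holds whole numbers (1, 2, 3, etc.), but when a slot has extra sub-level rows (.1, .2, etc.), I want to return only one row per slot (1, 2, 3, etc.), grouped by Num, and check whether the PassFail column has any failures.
If everything in the PassFail column passes for a slot and its sub-levels, return the first PASS.
If there is any failure among a slot's sub-levels, return the first failure for that Slot number within its Num group.
Note: in df, anything that has FAIL in the PassFail column counts as a failure.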
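The two rules above rest on two small building blocks, sketched here in base R on a few made-up values. This is only one possible reading of the rules: it assumes a row fails whenever its PassFail value contains the text "FAIL", and that a sub-level such as 1.2 belongs to the slot given by its whole-number part. The names is_fail and parent_slot are illustrative, not part of df.

```r
# Illustrative values only; not the full data frame
PassFail <- c("PASS", "FAIL", "FAIL_AVG", "FAIL#PTS", "FAIL#NODATA")
Slot     <- c(1.0, 1.1, 1.2, 2.0, 2.3)

# Assumption: any value containing the literal text "FAIL" is a failure
is_fail <- grepl("FAIL", PassFail, fixed = TRUE)

# Assumption: the whole-number part of Slot identifies the parent slot
parent_slot <- floor(Slot)

print(data.frame(PassFail, Slot, is_fail, parent_slot))
```

Grouping by Num and parent_slot and then looking at is_fail inside each group is what the rest of the problem comes down to.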
Desired output
ID     Type        PassFail     Slot  Num
ID001  Length      PASS         1.0   1111
ID001  Length      PASS         1.0   1112
ID003  Length      FAIL         2.0   1113
ID009  LengthTest  FAIL         1.1   1114
ID023  LengthTest  FAIL#NODATA  2.2   1115
ID027  LengthTest  FAIL         3.1   1115
I tried to start by picking out any slots that have sub-levels, but I'm not sure that is the right way to approach this problem. Can someone point me in the right direction?
Answer 0 (score: 1)
Answer 1 (score: 0)
This gets close to your desired output.
library(data.table)
setDT(df)
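df[order(Slot), {
    # TRUE only when every row in this (Num, whole-number slot) group is "PASS"
    mytest <- all(PassFail == "PASS")
    # Keep the first row of the group and collapse its status to PASS or FAIL
    .(ID = ID[1], Type = Type[1],
      PassFail = if (mytest) "PASS" else "FAIL",
      Slot = Slot[1])
  }, by = .(Num, intslot = as.integer(Slot))][order(Num, Slot)]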
It ignores the "notes" in PassFail and simply treats anything that isn't "PASS" as a "FAIL".
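If the goal is to keep the original failing row, including its ID, Type, Slot and PassFail note, a variation along the following lines may get closer to the desired output. It is a sketch under two assumptions taken from the question: a row counts as a failure when PassFail contains "FAIL", and the whole-number part of Slot identifies the slot. The helper columns failed and intslot and the variable pick are illustrative names.

```r
library(data.table)

# Same sample data as in the question, built directly as a data.table
dt <- data.table(
  ID = c("ID001","ID001","ID003","ID003","ID003","ID006","ID007","ID007","ID009","ID010",
         "ID021","ID021","ID023","ID023","ID023","ID026","ID027","ID207","ID023"),
  Type = c("Length","Length","LengthTest","LengthTest","Length","LengthTest","LengthTest",
           "Length","LengthTest","LengthTest","LengthTest","Length","LengthTest",
           "LengthTest","LengthTest","Length","LengthTest","LengthTest","LengthTest"),
  PassFail = c("PASS","PASS","PASS","PASS","FAIL","FAIL_AVG","FAIL#PTS","PASS","FAIL","PASS",
               "FAIL_SIG","PASS","PASS","FAIL#NODATA","PASS","PASS","FAIL","FAIL#PTS","PASS"),
  Slot = c(1.0,1.0,1.1,1.2,2.0,2.1,2.2,1.0,1.1,1.2,
           1.3,2.0,2.1,2.2,2.3,3.0,3.1,3.2,3.3),
  Num = c(1111,1112,1112,1112,1113,1113,1113,1114,1114,1114,
          1114,1115,1115,1115,1115,1115,1115,1115,1115)
)

# Assumption: any PassFail value containing "FAIL" is a failure
dt[, failed := grepl("FAIL", PassFail, fixed = TRUE)]
# Assumption: 1.1, 1.2, ... are sub-levels of slot 1
dt[, intslot := floor(Slot)]

# Order rows so that "first" means lowest Slot within each Num
setorder(dt, Num, Slot)

res <- dt[, {
  # Index of the first failing row in the group, or the first row if all pass
  pick <- if (any(failed)) which(failed)[1L] else 1L
  .(ID = ID[pick], Type = Type[pick], PassFail = PassFail[pick], Slot = Slot[pick])
}, by = .(Num, intslot)]

# Drop the helper grouping column and restore the original column order
res <- res[, .(ID, Type, PassFail, Slot, Num)]
print(res)
```

Unlike the one-liner above, this keeps values such as FAIL#NODATA as they are instead of collapsing them to "FAIL".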