How to return only the rows that meet specific conditions on Slot, grouped by Num

Asked: 2018-04-02 18:43:05

Tags: r dataframe dplyr data.table

I have a sample data frame that I am working with

ID <- c("ID001","ID001","ID003","ID003","ID003","ID006","ID007","ID007","ID009","ID010",
        "ID021","ID021","ID023","ID023","ID023","ID026","ID027","ID207","ID023")
Type <- c("Length","Length","LengthTest","LengthTest","Length","LengthTest","LengthTest","Length","LengthTest","LengthTest",
          "LengthTest","Length","LengthTest","LengthTest","LengthTest","Length","LengthTest","LengthTest","LengthTest")
PassFail <- c("PASS","PASS","PASS","PASS","FAIL","FAIL_AVG","FAIL#PTS","PASS","FAIL","PASS",
              "FAIL_SIG","PASS","PASS","FAIL#NODATA","PASS","PASS","FAIL","FAIL#PTS","PASS")
Slot <- c(1.0,1.0,1.1,1.2,2.0,2.1,2.2,1.0,1.1,1.2,
          1.3,2.0,2.1,2.2,2.3,3.0,3.1,3.2,3.3)
Num <- c(1111,1112,1112,1112,1113,1113,1113,1114,1114,1114,
         1114,1115,1115,1115,1115,1115,1115,1115,1115)

df <- data.frame(ID,Type,PassFail,Slot,Num)

df

      ID       Type    PassFail Slot  Num
   ID001     Length        PASS  1.0 1111
   ID001     Length        PASS  1.0 1112
   ID003 LengthTest        PASS  1.1 1112
   ID003 LengthTest        PASS  1.2 1112
   ID003     Length        FAIL  2.0 1113
   ID006 LengthTest    FAIL_AVG  2.1 1113
   ID007 LengthTest    FAIL#PTS  2.2 1113
   ID007     Length        PASS  1.0 1114
   ID009 LengthTest        FAIL  1.1 1114
   ID010 LengthTest        PASS  1.2 1114
   ID021 LengthTest    FAIL_SIG  1.3 1114
   ID021     Length        PASS  2.0 1115
   ID023 LengthTest        PASS  2.1 1115
   ID023 LengthTest FAIL#NODATA  2.2 1115
   ID023 LengthTest        PASS  2.3 1115
   ID026     Length        PASS  3.0 1115
   ID027 LengthTest        FAIL  3.1 1115
   ID207 LengthTest    FAIL#PTS  3.2 1115
   ID023 LengthTest        PASS  3.3 1115

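I am trying to reduce this data frame to only the rows that meet certain conditions. I want to summarise the Slot column, grouped by the Num column.

The Slot column normally holds whole numbers (1, 2, 3, etc.), but when a slot has extra rows (sub-levels such as 0.1, 0.2, etc.), I want to return only one row per slot (1, 2, 3, etc.), grouped by Num, after checking whether the PassFail column contains any failures. If everything in the PassFail column for a Slot within a Num passes, return the first PASS. If any sub-level of the slot has a failure, return the first failure for that Slot number within its Num group.

Note: in df, anything containing FAIL in the PassFail column counts as a failure.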

Desired output

     ID       Type    PassFail Slot  Num
  ID001     Length        PASS  1.0 1111
  ID001     Length        PASS  1.0 1112
  ID003     Length        FAIL  2.0 1113
  ID009 LengthTest        FAIL  1.1 1114
  ID023 LengthTest FAIL#NODATA  2.2 1115
  ID027 LengthTest        FAIL  3.1 1115

I tried to pick out the slots that have sub-levels, but I am not sure that is the right way to approach this problem. Can someone point me in the right direction?

2 Answers:

Answer 0: (score: 1)

A tidyverse solution using case_when.

Created on 2018-04-02 by the reprex package (v0.2.0).
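
A minimal sketch of how a case_when-based tidyverse approach to this question could look. It is an illustration under stated assumptions rather than the answerer's original code: the helper columns slot_int, is_fail and keep are names assumed here, and a failure is taken to be any PassFail value containing "FAIL", as the question states.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  ID = c("ID001","ID001","ID003","ID003","ID003","ID006","ID007","ID007","ID009","ID010",
         "ID021","ID021","ID023","ID023","ID023","ID026","ID027","ID207","ID023"),
  Type = c("Length","Length","LengthTest","LengthTest","Length","LengthTest","LengthTest",
           "Length","LengthTest","LengthTest","LengthTest","Length","LengthTest",
           "LengthTest","LengthTest","Length","LengthTest","LengthTest","LengthTest"),
  PassFail = c("PASS","PASS","PASS","PASS","FAIL","FAIL_AVG","FAIL#PTS","PASS","FAIL","PASS",
               "FAIL_SIG","PASS","PASS","FAIL#NODATA","PASS","PASS","FAIL","FAIL#PTS","PASS"),
  Slot = c(1.0,1.0,1.1,1.2,2.0,2.1,2.2,1.0,1.1,1.2,1.3,2.0,2.1,2.2,2.3,3.0,3.1,3.2,3.3),
  Num = c(1111,1112,1112,1112,1113,1113,1113,1114,1114,1114,
          1114,1115,1115,1115,1115,1115,1115,1115,1115),
  stringsAsFactors = FALSE
)

result <- df %>%
  # slot_int (assumed name): the whole-number slot that each sub-level belongs to
  group_by(Num, slot_int = floor(Slot)) %>%
  mutate(
    # is_fail (assumed name): TRUE for any PassFail value containing "FAIL"
    is_fail = grepl("FAIL", PassFail, fixed = TRUE),
    keep = case_when(
      # a failing row is kept only if it is the first failure in its group
      is_fail ~ cumsum(is_fail) == 1,
      # a passing row is kept only if the group has no failure and it comes first
      TRUE ~ !any(is_fail) & row_number() == 1
    )
  ) %>%
  ungroup() %>%
  filter(keep) %>%
  select(ID, Type, PassFail, Slot, Num)

print(result)
```

On the sample data this keeps one row per Num and whole-number Slot, in the original row order.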

Answer 1: (score: 0)

This gets close to your desired output.

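library(data.table)
setDT(df)

# Group by Num and the integer part of Slot, visiting rows in Slot order
df[order(Slot), {
  all_pass <- all(PassFail == "PASS")  # TRUE only when every row in the group passed
  .(ID = ID[1], Type = Type[1],
    PassFail = if (all_pass) "PASS" else "FAIL",
    Slot = Slot[1])                    # always reports the group's first row
}, by = .(Num, intslot = as.integer(Slot))][order(Num, Slot)]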

It ignores the "notes" in PassFail and simply treats anything that is not "PASS" as "FAIL".
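
If the exact first failing row is needed, including its original PassFail note, one possible variant is sketched below. It is not part of the answer above: the names dt, is_fail, slot_int and idx are assumptions introduced for illustration, and a failure is again taken to be any PassFail value containing "FAIL".

```r
library(data.table)

# Sample data from the question
dt <- data.table(
  ID = c("ID001","ID001","ID003","ID003","ID003","ID006","ID007","ID007","ID009","ID010",
         "ID021","ID021","ID023","ID023","ID023","ID026","ID027","ID207","ID023"),
  Type = c("Length","Length","LengthTest","LengthTest","Length","LengthTest","LengthTest",
           "Length","LengthTest","LengthTest","LengthTest","Length","LengthTest",
           "LengthTest","LengthTest","Length","LengthTest","LengthTest","LengthTest"),
  PassFail = c("PASS","PASS","PASS","PASS","FAIL","FAIL_AVG","FAIL#PTS","PASS","FAIL","PASS",
               "FAIL_SIG","PASS","PASS","FAIL#NODATA","PASS","PASS","FAIL","FAIL#PTS","PASS"),
  Slot = c(1.0,1.0,1.1,1.2,2.0,2.1,2.2,1.0,1.1,1.2,1.3,2.0,2.1,2.2,2.3,3.0,3.1,3.2,3.3),
  Num = c(1111,1112,1112,1112,1113,1113,1113,1114,1114,1114,
          1114,1115,1115,1115,1115,1115,1115,1115,1115)
)

# is_fail (assumed name): TRUE for any PassFail value containing "FAIL"
dt[, is_fail := grepl("FAIL", PassFail, fixed = TRUE)]

# For each Num and whole-number slot, take the row index of the first failure,
# or of the first row when the group contains no failure
idx <- dt[, .I[if (any(is_fail)) which(is_fail)[1] else 1L],
          by = .(Num, slot_int = floor(Slot))]$V1

result <- dt[sort(idx), .(ID, Type, PassFail, Slot, Num)]
print(result)
```

Selecting by row index with .I keeps every original column value of the chosen row, so notes such as FAIL#NODATA are preserved rather than collapsed to "FAIL".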