Filtering rows with dplyr

Asked: 2017-01-24 15:26:55

Tags: r dataframe filter dplyr conditional-statements

I have a data frame:

soDf <- structure(
  list(
    State = c("Exception", "Exception", "Exception", "Exception", "Approval", "Processing"),
    User = c("1", "2", "1", "3", "1", "4"),
    Voucher.Number = c(10304685L, 10304685L, 10304685L, 10304685L, 10304685L, 10304685L),
    Queue.Exit.Date = c("8/24/2016 14:59", "8/26/2016 13:25", "8/26/2016 15:56",
                        "8/26/2016 16:13", "8/26/2016 16:25", "8/26/2016 17:34")
  ),
  .Names = c("State", "User", "Voucher.Number", "Queue.Exit.Date"),
  row.names = 114:119,
  class = "data.frame"
)

I have a list of rules that I want to use to filter rows.

One of the rules is:

(Voucher.Number == lag(Voucher.Number)) & (State == 'Exception' & lag(State) == 'Exception' )

If the current and the lagged voucher number are equal, and both rows have the Exception state, the row should be flagged as TRUE.
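To see the rule in isolation, here is a minimal sketch on a three-row toy data frame modeled on soDf. The names toy, prevState, and bothException are made up for illustration and do not appear in the question.

```r
library(dplyr)

# Hypothetical toy data modeled on the question's soDf
toy <- data.frame(
  State = c("Exception", "Exception", "Approval"),
  Voucher.Number = c(10304685L, 10304685L, 10304685L),
  stringsAsFactors = FALSE
)

toy <- toy %>%
  mutate(
    # lag() shifts the column down by one row, so the first row gets NA
    prevState = lag(State),
    # the rule from the question, on its own
    bothException = Voucher.Number == lag(Voucher.Number) &
      State == "Exception" & lag(State) == "Exception"
  )

print(toy)
```

The first row has no previous row to compare against, so its flag is NA rather than TRUE or FALSE. The second row follows an Exception row and is itself Exception, so it is flagged TRUE. The third row is Approval, so it is FALSE.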

When I apply this rule together with several others, row 4 is returned as TRUE when it should be returned as FALSE:

       State User Voucher.Number Queue.Exit.Date toFilt
1  Exception    1       10304685 8/24/2016 14:59     NA
2  Exception    2       10304685 8/26/2016 13:25   TRUE
3  Exception    1       10304685 8/26/2016 15:56   TRUE
4  Exception    3       10304685 8/26/2016 16:13   TRUE
5   Approval    1       10304685 8/26/2016 16:25  FALSE
6 Processing    4       10304685 8/26/2016 17:34  FALSE

Here is the code I use for all of the filter rules:

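library(dplyr)
soDf <- soDf %>%
  arrange(Voucher.Number, Queue.Exit.Date) %>%
  mutate(toFilt =
           # same user and same voucher as the previous row
           (User == lag(User) & Voucher.Number == lag(Voucher.Number)) |
           # voucher differs from the previous row and the state is Exception
           (Voucher.Number != lag(Voucher.Number) & State == "Exception") |
           # same voucher, and both this row and the previous row are Exception
           (Voucher.Number == lag(Voucher.Number) & State == "Exception" & lag(State) == "Exception") |
           # same test as the first condition, with the operands swapped
           (Voucher.Number == lag(Voucher.Number) & User == lag(User)))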
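One thing that may be worth checking in the arrange() step: Queue.Exit.Date is stored as character, so rows are ordered by comparing the text, not the time. The sample rows happen to sort the same either way. The sketch below uses two made-up timestamps in the same format to show where the two orderings differ; the names dates, byText, byTime, and exitTime are hypothetical, and the UTC time zone is an assumption.

```r
library(dplyr)

# Hypothetical timestamps in the question's format, chosen to show the difference
dates <- data.frame(
  Queue.Exit.Date = c("8/24/2016 14:59", "8/3/2016 10:00"),
  stringsAsFactors = FALSE
)

# Sorting the raw strings compares them character by character,
# so "8/24/..." lands before "8/3/..." because "2" sorts before "3"
byText <- dates %>% arrange(Queue.Exit.Date)

# Parsing to a date-time first gives chronological order; tz = "UTC" is an assumption
byTime <- dates %>%
  mutate(exitTime = as.POSIXct(Queue.Exit.Date, format = "%m/%d/%Y %H:%M", tz = "UTC")) %>%
  arrange(exitTime)

print(byText)
print(byTime)
```

Because every rule relies on lag(), the row order decides which row counts as the previous one, so a wrong sort order would change the flags.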

1 answer:

Answer 0 (score: 0)

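Row 5 does not meet any of the conditional statements in the mutate column. Its State is "Approval" rather than "Exception", and its User ID does not match the lagged User ID.

It therefore returns FALSE, because none of the four statements is TRUE. This does not look like a coding error; the conditional statements just need to be changed to fit your needs. Hope this helps!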
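One way to check which statement fires on which row is to give each rule its own column before combining them. This is only a sketch: the column names sameVoucher, sameUser, newVoucherExc, and bothExc are made up here, the data frame is rebuilt with data.frame() for brevity, and Queue.Exit.Date is left out because the sample rows are already in order.

```r
library(dplyr)

# The question's data, rebuilt for brevity (Queue.Exit.Date omitted)
soDf <- data.frame(
  State = c("Exception", "Exception", "Exception", "Exception", "Approval", "Processing"),
  User = c("1", "2", "1", "3", "1", "4"),
  Voucher.Number = rep(10304685L, 6),
  stringsAsFactors = FALSE
)

# One column per rule; these column names are hypothetical
checked <- soDf %>%
  mutate(
    sameVoucher   = Voucher.Number == lag(Voucher.Number),
    sameUser      = sameVoucher & User == lag(User),
    newVoucherExc = !sameVoucher & State == "Exception",
    bothExc       = sameVoucher & State == "Exception" & lag(State) == "Exception",
    # same combination as in the question, without the repeated condition
    toFilt        = sameUser | newVoucherExc | bothExc
  )

print(checked)
```

On the sample data, rows 2 to 4 are TRUE only because of bothExc, and rows 5 and 6 are FALSE in every rule column. Row 1 is NA throughout because lag() has no previous row to compare against.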