R - Looping and flagging with multiple conditions

Date: 2017-10-04 16:00:19

Tags: r dataframe data-modeling

I need help with R code for the following example scenario:

Set 1:

Enc     CPT     Paid Status 
23345   97110   Paid    
23345   97140   Non Paid
23349   99396   Paid
23349   36415   Non Paid
23349   99000   Non Paid
23354   99203O  Non Paid
23367   73030   Non Paid
23367   99024   Non Paid
23372   99213O  Paid
23372   36416   Non Paid
23382   81002   Non Paid

Set 2:

Main_CPT  Child_CPT
97110     36591
97110     36592
97110     99186
97140     36591
97140     36592
97140     97124
97140     99186
36415     36591
36415     36592
99396     36591
99396     36592
99396     94002
99396     94003
73030     36591
73030     99024
81002     94002
81002     94003

Expected output:

Enc     CPT     Paid Status    Flag
23345   97110   Paid           Paid Already
23345   97140   Non Paid       Adjusted
23349   99396   Paid           Paid Already
23349   36415   Non Paid       Adjusted
23349   99000   Non Paid       Not Applicable
23354   99203O  Non Paid       Not in Main CPT
23367   73030   Non Paid       Not Paid
23367   99024   Non Paid       Review
23372   99213O  Paid           Not in Main CPT
23372   36416   Non Paid       Not in Main CPT
23382   81002   Non Paid       Not Paid

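Model criteria (applied per Enc group):

  1. Within an Enc group, if any CPT matches Main_CPT:

    a) and its Paid Status is "P": Flag = "Paid Already"

    i) Remaining CPTs of that Enc group that match Child_CPT, irrespective of Paid Status: Flag = "Adjusted"

    ii) Remaining CPTs of that Enc group that do not match Child_CPT, irrespective of Paid Status: Flag = "Not Applicable"

    b) and its Paid Status is "NP": Flag = "Not Paid"

    i) Remaining CPTs of that Enc group that match Child_CPT, irrespective of Paid Status: Flag = "Review"

    ii) Remaining CPTs of that Enc group that do not match Child_CPT, irrespective of Paid Status: Flag = "Not Applicable"

  2. If no CPT in the Enc group matches Main_CPT, irrespective of any other condition: Flag = "Not in Main CPT"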
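The criteria above can be sketched in base R as follows. This is only one possible reading, not a definitive implementation. The names `claims`, `cpt_map` and `flag_group` are hypothetical, and `Paid Status` is written as `Paid_Status` with `Non_Paid` values so that `read.table` can parse it. Two assumptions are made to resolve ambiguities: the first CPT in a group that appears in Main_CPT is treated as that group's main CPT, and the remaining CPTs are matched against either column of Set 2, because the expected output flags 97140 and 36415 as "Adjusted" although Set 2 lists them only under Main_CPT.

```r
# Set 1 from the question; CPT is kept as text because of codes like 99203O
claims <- read.table(text = "Enc CPT Paid_Status
23345 97110 Paid
23345 97140 Non_Paid
23349 99396 Paid
23349 36415 Non_Paid
23349 99000 Non_Paid
23354 99203O Non_Paid
23367 73030 Non_Paid
23367 99024 Non_Paid
23372 99213O Paid
23372 36416 Non_Paid
23382 81002 Non_Paid", header = TRUE,
  colClasses = c("integer", "character", "character"))

# Set 2 from the question
cpt_map <- read.table(text = "Main_CPT Child_CPT
97110 36591
97110 36592
97110 99186
97140 36591
97140 36592
97140 97124
97140 99186
36415 36591
36415 36592
99396 36591
99396 36592
99396 94002
99396 94003
73030 36591
73030 99024
81002 94002
81002 94003", header = TRUE, colClasses = "character")

# Flag the rows of one Enc group (hypothetical helper)
flag_group <- function(cpt, paid, map) {
  is_main <- cpt %in% map$Main_CPT
  # Rule 2: no CPT of the group is a Main_CPT
  if (!any(is_main)) return(rep("Not in Main CPT", length(cpt)))

  main_idx  <- which(is_main)[1]          # assumption: first match is the main CPT
  main_paid <- paid[main_idx] == "Paid"

  # assumption: remaining CPTs are looked up in both columns of Set 2
  in_map <- cpt %in% c(map$Main_CPT, map$Child_CPT)
  flag <- ifelse(in_map,
                 if (main_paid) "Adjusted" else "Review",
                 "Not Applicable")
  flag[main_idx] <- if (main_paid) "Paid Already" else "Not Paid"
  flag
}

# Apply the helper to each Enc group in turn
claims$Flag <- NA_character_
for (enc in unique(claims$Enc)) {
  rows <- claims$Enc == enc
  claims$Flag[rows] <- flag_group(claims$CPT[rows], claims$Paid_Status[rows], cpt_map)
}
claims
```

Under these assumptions the `Flag` column matches the expected output listed in the question. If remaining CPTs should be compared with Child_CPT only, replace the `in_map` line with `cpt %in% map$Child_CPT`; 97140 and 36415 would then be flagged "Not Applicable".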

1 Answer:

Answer 0: (score: 0)

This should do the job (using dplyr):

library(dplyr)
df %>%
  # CPT may have been read as a factor; compare it as character
  mutate_if(is.factor, as.character) %>%
  group_by(Enc) %>%
  # keep only Enc groups in which at least one CPT is a Main_CPT
  filter(any(CPT %in% lookup$Main_CPT)) %>%
  mutate(Flag = case_when(
    Paid_Status == "Paid" ~ "P",
    (CPT %in% lookup$Child_CPT) & (Paid_Status == "Non_Paid") ~ "A",
    !(CPT %in% lookup$Child_CPT) ~ "NP"
  ))

Result:

# A tibble: 3 x 4
# Groups:   Enc [1]
    Enc   CPT Paid_Status  Flag
  <int> <chr>       <chr> <chr>
1 23349 99396        Paid     P
2 23349 36415    Non_Paid     A
3 23349 99000    Non_Paid    NP

Data:

df = read.table(text = "Enc     CPT     Paid_Status 
23345   97110   Paid    
23345   97140   Non_Paid
23349   99396   Paid
23349   36415   Non_Paid
23349   99000   Non_Paid
23354   99203O  Paid
23367   73030   Paid
23367   99024   Non_Paid
23372   99213O  Paid
23372   36415   Non_Paid
23382   81002   Non_Paid", header = TRUE)

lookup = read.table(text = "Main_CPT  Child_CPT
26010     01810
26010     99396
26010     0213T
26010     0216T
99396     64490
99396     36415
99396     64492
99396     64493
99396     64494
99396     64495
26034     64530
26034     69990
26034     36415
26034     99149
26034     99150", header = TRUE)

library(dplyr)
lookup = lookup %>% mutate_all(as.character)
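
The data above is converted to character after it has been read. A possible alternative, sketched here with a hypothetical two-row table (`txt`, `by_default` and `as_text` are illustrative names), is to pass `colClasses = "character"` to `read.table` so that CPT codes are never parsed as numbers or factors. This matters for codes that consist only of digits and start with a zero, such as 01810, in a column without any alphanumeric codes.

```r
# Hypothetical table whose codes are all digits, some with a leading zero
txt <- "Main_CPT Child_CPT
01810 00100
26010 01810"

# Default parsing: all-digit columns become integer, so leading zeros are lost
by_default <- read.table(text = txt, header = TRUE)

# Forcing character columns keeps every code exactly as written
as_text <- read.table(text = txt, header = TRUE, colClasses = "character")

by_default$Main_CPT   # integer vector; 01810 has become 1810
as_text$Main_CPT      # character vector; "01810" is preserved
```

With character columns from the start, `%in%` comparisons between the two tables work on the codes as text and no later conversion step is needed.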