Fill in the gaps between two dates in a data frame in R

Time: 2017-11-15 12:11:30

Tags: r dataframe dplyr

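I would like to fill in the NAs between two values within a group, carrying forward the status that comes first. The group is the "label" field, and the NAs I want to fill are in the "status" field.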

This is what I have:

     datetime                                      title    label   status           option_title
1  2016-08-06                            Pursuit status:      AIG   Active1                   <NA>
2  2016-08-06 What is the current stage of this Pursuit?      AIG     <NA> 1 - Opportunity Review
3  2016-08-31 What is the current stage of this Pursuit?      AIG     <NA>    2 - Solution Review
4  2016-12-13 What is the current stage of this Pursuit?      AIG     <NA>    4 - Submit Proposal
5  2016-11-14                            Pursuit status:  Allianz   Active1                   <NA>
6  2016-10-27 What is the current stage of this Pursuit?  Allianz     <NA>      Pre-Qualification
7  2017-05-18 What is the current stage of this Pursuit?  Allianz     <NA> 1 - Opportunity Review
8  2017-05-18 What is the current stage of this Pursuit?  Allianz     <NA>      Pre-Qualification
9  2017-08-24                            Pursuit status:  Allianz Inactive1                   <NA>
10 2016-10-27 What is the current stage of this Pursuit?  Allianz     <NA>      Pre-Qualification
11 2016-11-14                            Pursuit status:  Allianz   Active2                   <NA>
12 2016-12-19 What is the current stage of this Pursuit?  Allianz     <NA> 1 - Opportunity Review
13 2017-04-14 What is the current stage of this Pursuit?  Allianz     <NA>    2 - Solution Review

This is what I want:

     datetime                                      title    label   status           option_title
1  2016-08-06                            Pursuit status:      AIG   Active1                   <NA>
2  2016-08-06 What is the current stage of this Pursuit?      AIG   Active1 1 - Opportunity Review
3  2016-08-31 What is the current stage of this Pursuit?      AIG   Active1    2 - Solution Review
4  2016-12-13 What is the current stage of this Pursuit?      AIG   Active1    4 - Submit Proposal
5  2016-11-14                            Pursuit status:  Allianz   Active1                   <NA>
6  2016-10-27 What is the current stage of this Pursuit?  Allianz   Active1      Pre-Qualification
7  2017-05-18 What is the current stage of this Pursuit?  Allianz   Active1 1 - Opportunity Review
8  2017-05-18 What is the current stage of this Pursuit?  Allianz   Active1      Pre-Qualification
9  2017-08-24                            Pursuit status:  Allianz Inactive1                   <NA>
10 2016-10-27 What is the current stage of this Pursuit?  Allianz Inactive1      Pre-Qualification
11 2016-11-14                            Pursuit status:  Allianz   Active2                   <NA>
12 2016-12-19 What is the current stage of this Pursuit?  Allianz   Active2 1 - Opportunity Review
13 2017-04-14 What is the current stage of this Pursuit?  Allianz   Active2    2 - Solution Review

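Is there a way to do this? I think the best approach would be to take the date of the first status and the date of the second status, and fill every value in between with the first status.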

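1 Answer:

Answer 0: (score: 1)

A tidyr approach:

It looks like you have already sorted out the ordering, so here is code that reproduces a minimal example of your dataset.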

library(dplyr)  # provides %>% and mutate()
library(tidyr)  # provides fill()

df_orig <- 
    read.table(text = "
             label   status 
               AIG   Active1
               AIG     <NA> 
               AIG     <NA> 
               AIG     <NA> 
           Allianz   Active1
           Allianz     <NA> 
           Allianz     <NA>
           Allianz     <NA> 
           Allianz Inactive1
           Allianz     <NA> 
           Allianz   Active2
           Allianz     <NA> 
           Allianz     <NA> 
                  ", header = TRUE, stringsAsFactors = FALSE) %>% 
    mutate(status = sub("<NA>", NA, status)) # turn the literal "<NA>" strings into real NA values

> str(df_orig)
'data.frame': 13 obs. of  2 variables:
 $ label : chr  "AIG" "AIG" "AIG" "AIG" ...
 $ status: chr  "Active1" NA NA NA ...
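
As a possible alternative to the sub() step above, read.table can do the conversion itself through its na.strings argument. This is only a sketch on a shortened, hypothetical copy of the data (df_alt is a name introduced here, not part of the original answer):

```r
# Shortened example data; "<NA>" is declared as the missing-value marker,
# so no post-processing of the status column is needed.
df_alt <- read.table(text = "
  label   status
    AIG  Active1
    AIG     <NA>
Allianz  Active1
Allianz     <NA>
", header = TRUE, stringsAsFactors = FALSE, na.strings = "<NA>")

str(df_alt)
```

With this setting the status column should still be read as character, with real NA values where the text contained "<NA>".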

Fill in the 'gaps'

df_filled <-  
    df_orig %>% 
    fill(status)

Result:

> df_filled
     label    status
1      AIG   Active1
2      AIG   Active1
3      AIG   Active1
4      AIG   Active1
5  Allianz   Active1
6  Allianz   Active1
7  Allianz   Active1
8  Allianz   Active1
9  Allianz Inactive1
10 Allianz Inactive1
11 Allianz   Active2
12 Allianz   Active2
13 Allianz   Active2
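
Note that fill(status) on its own ignores the "label" grouping. That happens to work for the data above because every label starts with a non-missing status. If a group could begin with an NA, the previous group's status would leak into it. A minimal sketch of a grouped fill, assuming hypothetical data (df_example and df_grouped are names made up for illustration):

```r
library(dplyr)
library(tidyr)

# Hypothetical data: group "B" starts with a missing status
df_example <- data.frame(
  label  = c("A", "A", "A", "B", "B", "B"),
  status = c("Active1", NA, NA, NA, "Active2", NA),
  stringsAsFactors = FALSE
)

# Fill downwards within each label only; the leading NA in "B"
# stays NA instead of inheriting "Active1" from group "A"
df_grouped <- df_example %>%
  group_by(label) %>%
  fill(status) %>%
  ungroup()

df_grouped$status
# "Active1" "Active1" "Active1" NA "Active2" "Active2"
```

Whether a leading NA should stay missing or be filled some other way depends on the data, so treat this as one option rather than the required fix.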