Label a data frame column relative to a marked row

Time: 2016-11-02 12:26:19

Tags: r database data.table

I have a database of customer orders. Within each customer, the orders are sorted by increasing date. Here are two of the customers:

data <- data.frame(
  ID_CLIENTE = c(rep(1, 8), rep(2, 8)),
  PEDIDO = c("A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8",
             "B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8"),
  LABEL = c(NA, NA, "1ER_PEDIDO", NA, NA, NA, NA, NA,
            NA, NA, NA, NA, "1ER_PEDIDO", NA, NA, NA),
  DATE = as.Date(c("2014-09-22", "2014-12-16", "2015-01-19", "2015-03-11",
                   "2015-05-18", "2015-10-28", "2016-04-13", "2016-06-09",
                   "2014-10-08", "2014-10-12", "2014-10-26", "2014-11-06",
                   "2014-11-24", "2014-12-10", "2014-12-11", "2015-01-12"))
)

        > data
           ID_CLIENTE PEDIDO      LABEL       DATE
        1           1     A1       <NA> 2014-09-22
        2           1     A2       <NA> 2014-12-16
        3           1     A3 1ER_PEDIDO 2015-01-19
        4           1     A4       <NA> 2015-03-11
        5           1     A5       <NA> 2015-05-18
        6           1     A6       <NA> 2015-10-28
        7           1     A7       <NA> 2016-04-13
        8           1     A8       <NA> 2016-06-09
        9           2     B1       <NA> 2014-10-08
        10          2     B2       <NA> 2014-10-12
        11          2     B3       <NA> 2014-10-26
        12          2     B4       <NA> 2014-11-06
        13          2     B5 1ER_PEDIDO 2014-11-24
        14          2     B6       <NA> 2014-12-10
        15          2     B7       <NA> 2014-12-11
        16          2     B8       <NA> 2015-01-12

I want to label every order that comes before or after the order marked "1ER_PEDIDO". The resulting data frame must look like this:

   ID_CLIENTE PEDIDO      LABEL       DATE
1           1     A1     BEFORE 2014-09-22
2           1     A2     BEFORE 2014-12-16
3           1     A3 1ER_PEDIDO 2015-01-19
4           1     A4      AFTER 2015-03-11
5           1     A5      AFTER 2015-05-18
6           1     A6      AFTER 2015-10-28
7           1     A7      AFTER 2016-04-13
8           1     A8      AFTER 2016-06-09
9           2     B1     BEFORE 2014-10-08
10          2     B2     BEFORE 2014-10-12
11          2     B3     BEFORE 2014-10-26
12          2     B4     BEFORE 2014-11-06
13          2     B5 1ER_PEDIDO 2014-11-24
14          2     B6      AFTER 2014-12-10
15          2     B7      AFTER 2014-12-11
16          2     B8      AFTER 2015-01-12

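Should I use data.table functions? The labelling has to be done per customer: take one customer at a time, go through all of that customer's orders, and then label them.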

2 Answers:

Answer 0 (score: 3)

Here is a data.table approach in two steps:

library(data.table)
setDT(data)

data[data[, DATE < DATE[LABEL == "1ER_PEDIDO" & !is.na(LABEL)], by = ID_CLIENTE]$V1,
     LABEL := "BEFORE"]

data[data[, DATE > DATE[LABEL == "1ER_PEDIDO" & !is.na(LABEL)], by = ID_CLIENTE]$V1,
     LABEL := "AFTER"]
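
The two updates above could also be folded into a single grouped assignment. What follows is only a sketch and not part of the original answer: the table `orders` is a made-up, shortened sample, and the code assumes that LABEL is a character column and that each customer has exactly one "1ER_PEDIDO" row.

```r
library(data.table)

# Hypothetical, shortened sample: four orders per customer
orders <- data.table(
  ID_CLIENTE = rep(1:2, each = 4),
  PEDIDO = c("A1", "A2", "A3", "A4", "B1", "B2", "B3", "B4"),
  LABEL = c(NA, "1ER_PEDIDO", NA, NA, NA, NA, "1ER_PEDIDO", NA),
  DATE = as.Date(c("2014-09-22", "2014-12-16", "2015-01-19", "2015-03-11",
                   "2014-10-08", "2014-10-12", "2014-10-26", "2014-11-06"))
)

orders[, LABEL := {
  # Date of the marked order within the current customer
  ref <- DATE[which(LABEL == "1ER_PEDIDO")[1]]
  # Earlier dates become BEFORE, later dates AFTER;
  # rows on the reference date keep their existing label
  as.character(ifelse(DATE < ref, "BEFORE",
               ifelse(DATE > ref, "AFTER", LABEL)))
}, by = ID_CLIENTE]

print(orders)
```

Like the two-step version, this compares dates rather than row positions, so it does not depend on the rows being sorted within each customer.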

Answer 1 (score: 1)

Here is a dplyr solution, based on the answers by @nicola and @akrun to my question:

library(tidyverse)
data %>%
  group_by(ID_CLIENTE) %>%
  mutate(LABEL = c('BEFORE', '1ER_PEDIDO', 'AFTER')[
    sign(seq_along(LABEL) - match('1ER_PEDIDO', LABEL)) + 2
  ])
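
To see why the indexing in `mutate()` works, it may help to run the same expression step by step on one customer's labels. This is a base R sketch; the vector `label` and the intermediate names `pos`, `offset`, `idx` and `labels` are made up for illustration.

```r
# Labels of a single hypothetical customer, in order of date
label <- c(NA, NA, "1ER_PEDIDO", NA, NA)

# Row position of the marked order
pos <- match("1ER_PEDIDO", label)

# Signed distance of every row from the marked row
offset <- seq_along(label) - pos

# sign() maps the distance to -1, 0 or 1; adding 2 gives the index 1, 2 or 3
idx <- sign(offset) + 2

# Pick the label that corresponds to each index
labels <- c("BEFORE", "1ER_PEDIDO", "AFTER")[idx]
print(labels)
# [1] "BEFORE"     "BEFORE"     "1ER_PEDIDO" "AFTER"      "AFTER"
```

Because this relies on row position, it assumes the orders are already sorted by date within each customer. If a customer has no "1ER_PEDIDO" row, `match()` returns NA and every label for that customer comes out as NA.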