I have a database containing customer orders. The orders are sorted by date in ascending order. Here are two of the customers:
data <- data.frame(
  ID_CLIENTE = c(rep(1, 8), rep(2, 8)),
  PEDIDO = c("A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8",
             "B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8"),
  LABEL = c(NA, NA, "1ER_PEDIDO", NA, NA, NA, NA, NA,
            NA, NA, NA, NA, "1ER_PEDIDO", NA, NA, NA),
  DATE = as.Date(c("2014-09-22", "2014-12-16", "2015-01-19", "2015-03-11",
                   "2015-05-18", "2015-10-28", "2016-04-13", "2016-06-09",
                   "2014-10-08", "2014-10-12", "2014-10-26", "2014-11-06",
                   "2014-11-24", "2014-12-10", "2014-12-11", "2015-01-12"))
)
> data
   ID_CLIENTE PEDIDO      LABEL       DATE
1           1     A1       <NA> 2014-09-22
2           1     A2       <NA> 2014-12-16
3           1     A3 1ER_PEDIDO 2015-01-19
4           1     A4       <NA> 2015-03-11
5           1     A5       <NA> 2015-05-18
6           1     A6       <NA> 2015-10-28
7           1     A7       <NA> 2016-04-13
8           1     A8       <NA> 2016-06-09
9           2     B1       <NA> 2014-10-08
10          2     B2       <NA> 2014-10-12
11          2     B3       <NA> 2014-10-26
12          2     B4       <NA> 2014-11-06
13          2     B5 1ER_PEDIDO 2014-11-24
14          2     B6       <NA> 2014-12-10
15          2     B7       <NA> 2014-12-11
16          2     B8       <NA> 2015-01-12
I want to label every order placed before and after the order marked "1ER_PEDIDO". The resulting data frame must look like this:
   ID_CLIENTE PEDIDO      LABEL       DATE
1           1     A1     BEFORE 2014-09-22
2           1     A2     BEFORE 2014-12-16
3           1     A3 1ER_PEDIDO 2015-01-19
4           1     A4      AFTER 2015-03-11
5           1     A5      AFTER 2015-05-18
6           1     A6      AFTER 2015-10-28
7           1     A7      AFTER 2016-04-13
8           1     A8      AFTER 2016-06-09
9           2     B1     BEFORE 2014-10-08
10          2     B2     BEFORE 2014-10-12
11          2     B3     BEFORE 2014-10-26
12          2     B4     BEFORE 2014-11-06
13          2     B5 1ER_PEDIDO 2014-11-24
14          2     B6      AFTER 2014-12-10
15          2     B7      AFTER 2014-12-11
16          2     B8      AFTER 2015-01-12
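Should I use data.table functions? The labeling has to be done per customer: I need to fix a customer, go through all the orders that customer made, and then label them.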
Answer 0: (score: 3)
Here is a data.table approach in two steps:
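library(data.table)
setDT(data)  # convert the data.frame to a data.table by reference
# Step 1: per client, flag orders dated before the "1ER_PEDIDO" order and label them
data[data[, DATE < DATE[LABEL == "1ER_PEDIDO" & !is.na(LABEL)], by = ID_CLIENTE]$V1,
     LABEL := "BEFORE"]
# Step 2: same for orders dated after it (the new "BEFORE" labels do not match "1ER_PEDIDO")
data[data[, DATE > DATE[LABEL == "1ER_PEDIDO" & !is.na(LABEL)], by = ID_CLIENTE]$V1,
     LABEL := "AFTER"]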
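The two steps can probably be folded into a single grouped assignment. The following is only a sketch, not part of the original answer: it rebuilds the question's data under the hypothetical name `orders`, uses a helper variable `ref` (also hypothetical) for the date of the marked order, and assumes each client has at most one "1ER_PEDIDO" row and a data.table version that provides `fifelse` (1.12.4 or later). Unlike indexing with `$V1`, it does not depend on each client's rows being contiguous.

```r
library(data.table)

# Rebuild the example data from the question (name `orders` is illustrative)
orders <- data.table(
  ID_CLIENTE = rep(1:2, each = 8),
  PEDIDO = c(paste0("A", 1:8), paste0("B", 1:8)),
  LABEL = c(NA, NA, "1ER_PEDIDO", rep(NA, 9), "1ER_PEDIDO", rep(NA, 3)),
  DATE = as.Date(c("2014-09-22", "2014-12-16", "2015-01-19", "2015-03-11",
                   "2015-05-18", "2015-10-28", "2016-04-13", "2016-06-09",
                   "2014-10-08", "2014-10-12", "2014-10-26", "2014-11-06",
                   "2014-11-24", "2014-12-10", "2014-12-11", "2015-01-12"))
)

# One pass per client: compare every order date with the date of the marked order
orders[, LABEL := {
  # `ref` is a hypothetical helper: date of this client's "1ER_PEDIDO" order
  ref <- DATE[which(LABEL == "1ER_PEDIDO")][1]
  fifelse(DATE < ref, "BEFORE",
          fifelse(DATE > ref, "AFTER", LABEL))  # equal dates keep their label
}, by = ID_CLIENTE]

print(orders)
```

Because the comparison is on `DATE` rather than on row position, this variant should also tolerate rows that are not sorted by date within a client.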
Answer 1: (score: 1)
Here is a dplyr solution, based on @nicola's and @akrun's answers to my question:
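library(tidyverse)
data %>%
  group_by(ID_CLIENTE) %>%
  # sign() of each row's offset from the "1ER_PEDIDO" row is -1, 0 or 1;
  # adding 2 turns it into index 1, 2 or 3 of the label vector
  mutate(LABEL = c('BEFORE', '1ER_PEDIDO', 'AFTER')[
    sign(seq_along(LABEL) - match('1ER_PEDIDO', LABEL)) + 2])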
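To see why the indexing expression works, it may help to run it step by step on a single client. The sketch below uses base R only and takes the LABEL column of client 1 from the question; the variable names (`label`, `pos`, `offset`, `idx`, `result`) are illustrative and not part of the original answer.

```r
# LABEL column of client 1, as given in the question
label <- c(NA, NA, "1ER_PEDIDO", NA, NA, NA, NA, NA)

pos <- match("1ER_PEDIDO", label)    # row position of the marked order
offset <- seq_along(label) - pos     # negative before it, 0 at it, positive after it
idx <- sign(offset) + 2              # maps -1, 0, 1 to 1, 2, 3

# Use the index to pick the matching label for every row
result <- c("BEFORE", "1ER_PEDIDO", "AFTER")[idx]
print(result)
```

Note that this trick relies on row position rather than on `DATE`, so it assumes the rows of each client are already sorted by date, as they are in the question.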