I have a dataset:
dt <- data.table(Customer = c("a", "a","b","b"), months = c(2,2,2,3), Date = c("2014-03-1","2015-10-1","2015-01-1","2016-01-1"), Cost = c("100","200","50","20"))
Customer months Date Cost
1: a 2 2014-03-1 100
2: a 2 2015-10-1 200
3: b 2 2015-01-1 50
4: b 3 2016-01-1 20
I want to repeat each row as many times as its months value:
dt %>% mutate(New.Date.month = as.Date(Date), rn1 = row_number()) %>%
slice(rep(rn1, months))%>%
group_by(Customer, rn1) %>%
mutate(New.Date.month = seq(first(Date), by="1 month", length.out=n()))
Customer months Date Cost New.Date.month rn1
<chr> <dbl> <date> <chr> <date> <int>
1 a 2 2014-03-01 100 2014-03-01 1
2 a 2 2014-03-01 100 2014-04-01 1
3 a 2 2015-10-01 200 2015-10-01 2
4 a 2 2015-10-01 200 2015-11-01 2
5 b 2 2015-01-01 50 2015-01-01 3
6 b 2 2015-01-01 50 2015-02-01 3
7 b 3 2016-01-01 20 2016-01-01 4
8 b 3 2016-01-01 20 2016-02-01 4
9 b 3 2016-01-01 20 2016-03-01 4
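However, I want to group by Customer and have New.Date.month increase in 1-month steps across all of a customer's rows, starting from that customer's first Date. So my desired output looks like this: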
Customer months Date Cost New.Date.month rn1
<chr> <dbl> <date> <chr> <date> <int>
1 a 2 2014-03-01 100 2014-03-01 1
2 a 2 2014-03-01 100 2014-04-01 1
3 a 2 2015-10-01 200 2014-05-01 2
4 a 2 2015-10-01 200 2014-06-01 2
5 b 2 2015-01-01 50 2015-01-01 3
6 b 2 2015-01-01 50 2015-02-01 3
7 b 3 2016-01-01 20 2015-03-01 4
8 b 3 2016-01-01 20 2015-04-01 4
9 b 3 2016-01-01 20 2015-05-01 4
I would greatly appreciate any help.
Thanks.
Answer 0 (score: 2)
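We need to remove 'rn1' from the group_by step. With only Customer as the grouping variable, n() counts all of a customer's repeated rows, so the monthly sequence runs across them instead of restarting at each original row: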
library(dplyr)
dt %>%
mutate(New.Date.month = as.Date(Date), rn1 = row_number()) %>%
slice(rep(rn1, months))%>%
group_by(Customer) %>%
mutate(New.Date.month = seq(first(New.Date.month), by="1 month", length.out=n()))
# A tibble: 9 x 6
# Groups: Customer [2]
# Customer months Date Cost New.Date.month rn1
# <chr> <dbl> <chr> <chr> <date> <int>
#1 a 2 2014-03-1 100 2014-03-01 1
#2 a 2 2014-03-1 100 2014-04-01 1
#3 a 2 2015-10-1 200 2014-05-01 2
#4 a 2 2015-10-1 200 2014-06-01 2
#5 b 2 2015-01-1 50 2015-01-01 3
#6 b 2 2015-01-1 50 2015-02-01 3
#7 b 3 2016-01-1 20 2015-03-01 4
#8 b 3 2016-01-1 20 2015-04-01 4
#9 b 3 2016-01-1 20 2015-05-01 4
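To see why the grouping matters, here is a minimal base R sketch using only customer "a" from the question. The names start, reps, per_row and per_customer are made up for illustration; they are not part of the original code.

```r
# Start dates and repeat counts for customer "a" (values taken from the question)
start <- as.Date(c("2014-03-01", "2015-10-01"))
reps  <- c(2, 2)

# Grouping by original row (Customer + rn1): every row restarts at its own date
per_row <- c(
  seq(start[1], by = "1 month", length.out = reps[1]),
  seq(start[2], by = "1 month", length.out = reps[2])
)

# Grouping by Customer only: one sequence from the first date,
# as long as all of the customer's repeated rows together
per_customer <- seq(start[1], by = "1 month", length.out = sum(reps))

print(per_row)
print(per_customer)
```

per_row corresponds to the output the question started with, and per_customer to the output it asked for.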
This can be simplified with uncount (no need to create the 'rn1' column):
library(tidyr)
dt %>%
uncount(months) %>%
group_by(Customer) %>%
mutate(New.Date.month = seq(as.Date(first(Date)),
by = "1 month", length.out = n()))
# A tibble: 9 x 4
# Groups: Customer [2]
# Customer Date Cost New.Date.month
# <chr> <chr> <chr> <date>
#1 a 2014-03-1 100 2014-03-01
#2 a 2014-03-1 100 2014-04-01
#3 a 2015-10-1 200 2014-05-01
#4 a 2015-10-1 200 2014-06-01
#5 b 2015-01-1 50 2015-01-01
#6 b 2015-01-1 50 2015-02-01
#7 b 2016-01-1 20 2015-03-01
#8 b 2016-01-1 20 2015-04-01
#9 b 2016-01-1 20 2015-05-01
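Note that the months column is gone from the output above, because uncount drops the weights column by default. If you want to keep it, or want a per-row repetition counter, something like the following should work. This is a sketch: df, out and the column name rep_id are arbitrary choices for illustration.

```r
library(dplyr)
library(tidyr)

# Same data as in the question, rebuilt as a tibble so the block runs on its own
df <- tibble(
  Customer = c("a", "a", "b", "b"),
  months   = c(2, 2, 2, 3),
  Date     = c("2014-03-1", "2015-10-1", "2015-01-1", "2016-01-1"),
  Cost     = c("100", "200", "50", "20")
)

out <- df %>%
  # .remove = FALSE keeps the months column; .id adds a counter within each original row
  uncount(months, .remove = FALSE, .id = "rep_id") %>%
  group_by(Customer) %>%
  mutate(New.Date.month = seq(as.Date(first(Date)),
                              by = "1 month", length.out = n())) %>%
  ungroup()

print(out)
```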
Also, since the initial dataset is a data.table, we can use data.table methods:
library(data.table)
dt[rep(seq_len(.N), months)][, New.Date.month := seq(as.Date(Date)[1],
by = "1 month", length.out = .N), Customer][]
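One thing to keep in mind with the data.table version: the first [ ] builds a new, expanded table, and := adds the column to that new table, not to dt. So the result has to be assigned if you want to keep it. A self-contained sketch (the name expanded is my own choice):

```r
library(data.table)

# Same data as in the question
dt <- data.table(
  Customer = c("a", "a", "b", "b"),
  months   = c(2, 2, 2, 3),
  Date     = c("2014-03-1", "2015-10-1", "2015-01-1", "2016-01-1"),
  Cost     = c("100", "200", "50", "20")
)

# Repeat each row 'months' times, then build one monthly sequence per customer
expanded <- dt[rep(seq_len(.N), months)][
  , New.Date.month := seq(as.Date(Date)[1], by = "1 month", length.out = .N),
  by = Customer
]

print(expanded)

# The original table still has 4 rows and no New.Date.month column
print(names(dt))
```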