R - Replicate rows based on a start and end date sequence

Time: 2015-10-17 00:10:52

Tags: r

I have a data frame "DF" that looks like this:

Flight.Start   Flight.End   Device      Partner   Creative   Days.in.Flight 
2015-08-31     2015-10-04   Standard    MSN       Video      35

What I need to do is "blow it out" (expand it) like this:

Flight.Start   Flight.End   Date         Device      Partner   Creative   Days.in.Flight 
2015-08-31     2015-10-04   2015-08-31   Standard    MSN       Video      35
2015-08-31     2015-10-04   2015-09-01   Standard    MSN       Video      35
2015-08-31     2015-10-04   2015-09-02   Standard    MSN       Video      35
2015-08-31     2015-10-04   2015-09-03   Standard    MSN       Video      35
2015-08-31     2015-10-04   2015-09-04   Standard    MSN       Video      35
2015-08-31     2015-10-04   2015-09-05   Standard    MSN       Video      35
2015-08-31     2015-10-04   2015-09-06   Standard    MSN       Video      35
2015-08-31     2015-10-04   2015-09-07   Standard    MSN       Video      35

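...and so on, until the Date variable reaches 2015-10-04, and then on to the next row to replicate.

Basically, each row would be replicated Days.in.Flight - 1 times (since the already existing row accounts for one of the days), and a new column "Date" would be filled in with the dates of that flight. So if a row has a start and end date of 9/1 and 9/5 respectively, 4 replicated rows would be appended to the existing row, a new column (Date) would be created, and that column would be filled with the sequence of dates from the original row's Flight.Start to its Flight.End.

All date values are formatted as Date, Days.in.Flight is numeric, and the rest are factors.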
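
To make the row count concrete, here is a minimal sketch of the 9/1 to 9/5 example. The year 2015 is an assumption borrowed from the sample data, and flight_start, flight_end and dates are names made up for illustration. The inclusive date sequence has 5 entries, so 4 rows are added to the existing one.

```r
# Assumed year: 2015 (the example in the text only says 9/1 and 9/5)
flight_start <- as.Date("2015-09-01")
flight_end   <- as.Date("2015-09-05")

# Inclusive daily sequence from start to end
dates <- seq(flight_start, flight_end, by = "day")

length(dates)      # 5: total rows wanted for this flight
length(dates) - 1  # 4: extra rows appended to the existing one
```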

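Edit

In response to the duplicate-question flag:

To clarify, this is not like the question it was flagged as a duplicate of, because my question isn't really about how to replicate rows based on Days.in.Flight (I already know how to do that!), but about how to then add a column to that output data frame and fill in the dates sequentially over the corresponding flight period. Thanks for the heads-up, though...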

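3 answers:

Answer 0: (score: 7)

Here is one approach with splitstackshape and dplyr. Using expandRows() from the splitstackshape package, you can expand the data frame as you described. Then you want to add a sequence of dates with mutate(). What I did was group the data by the combination of Flight.Start and Flight.End, and create a date sequence for each group with seq(). first() takes the first element of Flight.Start and of Flight.End. This way, you can create the sequences you want. I hope this helps.

Data and code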

mydf <- data.frame(Flight.Start = as.Date(c("2015-09-01", "2015-09-10")),
                   Flight.End = as.Date(c("2015-09-03", "2015-09-15")),
                   Device = "Standard",
                   Creative = "Video",
                   Days.in.Flight = c(3, 6),
                   stringsAsFactors = FALSE)

#  Flight.Start Flight.End   Device Creative Days.in.Flight
#1   2015-09-01 2015-09-03 Standard    Video              3
#2   2015-09-10 2015-09-15 Standard    Video              6

library(splitstackshape)
library(dplyr)

expandRows(mydf, "Days.in.Flight", drop = FALSE) %>%
  group_by(Flight.Start, Flight.End) %>%
  mutate(Date = seq(first(Flight.Start),
                    first(Flight.End),
                    by = 1))

#  Flight.Start Flight.End   Device Creative Days.in.Flight       Date
#        (date)     (date)    (chr)    (chr)          (dbl)     (date)
#1   2015-09-01 2015-09-03 Standard    Video              3 2015-09-01
#2   2015-09-01 2015-09-03 Standard    Video              3 2015-09-02
#3   2015-09-01 2015-09-03 Standard    Video              3 2015-09-03
#4   2015-09-10 2015-09-15 Standard    Video              6 2015-09-10
#5   2015-09-10 2015-09-15 Standard    Video              6 2015-09-11
#6   2015-09-10 2015-09-15 Standard    Video              6 2015-09-12
#7   2015-09-10 2015-09-15 Standard    Video              6 2015-09-13
#8   2015-09-10 2015-09-15 Standard    Video              6 2015-09-14
#9   2015-09-10 2015-09-15 Standard    Video              6 2015-09-15
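
One thing to watch for: grouping by Flight.Start and Flight.End assumes each pair of dates occurs in only one original row. If two rows share the same flight dates (for example, different devices), the group is larger than its date sequence, and mutate() would be expected to stop with a size mismatch. Below is a minimal sketch of a workaround using dplyr only, assuming dplyr 1.0 or later. The row_id helper column, the mydf2 data and the "Mobile" device value are made up for illustration.

```r
library(dplyr)

# Hypothetical data: two rows with identical flight dates
mydf2 <- data.frame(Flight.Start = as.Date(c("2015-09-01", "2015-09-01")),
                    Flight.End = as.Date(c("2015-09-03", "2015-09-03")),
                    Device = c("Standard", "Mobile"),
                    Days.in.Flight = c(3, 3),
                    stringsAsFactors = FALSE)

out <- mydf2 %>%
  mutate(row_id = row_number()) %>%        # tag each original row
  slice(rep(row_id, Days.in.Flight)) %>%   # repeat row i Days.in.Flight[i] times
  group_by(row_id) %>%                     # one group per original row
  mutate(Date = seq(first(Flight.Start),
                    first(Flight.End),
                    by = "day")) %>%
  ungroup() %>%
  select(-row_id)

out
```

Because each original row gets its own group, the sequence length only has to match that row's Days.in.Flight.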

Answer 1: (score: 5)

Or, using data.table: we convert the 'data.frame' to a 'data.table' (setDT(mydf)), replicate the row sequence by 'Days.in.Flight', and subset the dataset with that index (.SD[rep(...)]). Then, grouped by 'Flight.Start' and 'Flight.End', we create the 'Date' column.

library(data.table)
setDT(mydf)[, .SD[rep(1:.N, Days.in.Flight)]][, 
     Date:= seq(Flight.Start , Flight.End, by = '1 day'),
     by = .(Flight.Start, Flight.End)][]
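
For readers who find the chained call dense, the same steps can be written out one at a time. This is only a sketch of the approach above. The names dt, idx and expanded are introduced here for illustration, and it assumes Days.in.Flight equals the inclusive number of days between Flight.Start and Flight.End; otherwise the := assignment would not match the group size.

```r
library(data.table)

mydf <- data.frame(Flight.Start = as.Date(c("2015-09-01", "2015-09-10")),
                   Flight.End = as.Date(c("2015-09-03", "2015-09-15")),
                   Device = "Standard",
                   Creative = "Video",
                   Days.in.Flight = c(3, 6),
                   stringsAsFactors = FALSE)

# as.data.table() makes a copy, whereas setDT() converts mydf in place
dt <- as.data.table(mydf)

# Row index repeated once per day in flight: 1 1 1 2 2 2 2 2 2
idx <- rep(seq_len(nrow(dt)), dt$Days.in.Flight)

# Subsetting by the repeated index replicates the rows
expanded <- dt[idx]

# Within each flight, fill Date with the daily sequence from start to end
expanded[, Date := seq(Flight.Start, Flight.End, by = "1 day"),
         by = .(Flight.Start, Flight.End)]

expanded[]
```

As with the dplyr version, the grouping assumes that no two original rows share the same Flight.Start and Flight.End.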

Answer 2: (score: 1)

Here is an approach in base R:

mydf <- data.frame(Flight.Start = as.Date(c("2015-09-01", "2015-09-10")),
                   Flight.End = as.Date(c("2015-09-03", "2015-09-15")),
                   Device = "Standard",
                   Creative = "Video",
                   Days.in.Flight = c(3, 6),
                   stringsAsFactors = FALSE)

expanded <- mydf[rep(row.names(mydf), mydf$Days.in.Flight), ]
data.frame(expanded, Date = expanded$Flight.Start + (sequence(mydf$Days.in.Flight) - 1))

> data.frame(expanded,Date=expanded$Flight.Start+(sequence(mydf$Days.in.Flight)-1))
    Flight.Start Flight.End   Device Creative Days.in.Flight       Date
1     2015-09-01 2015-09-03 Standard    Video              3 2015-09-01
1.1   2015-09-01 2015-09-03 Standard    Video              3 2015-09-02
1.2   2015-09-01 2015-09-03 Standard    Video              3 2015-09-03
2     2015-09-10 2015-09-15 Standard    Video              6 2015-09-10
2.1   2015-09-10 2015-09-15 Standard    Video              6 2015-09-11
2.2   2015-09-10 2015-09-15 Standard    Video              6 2015-09-12
2.3   2015-09-10 2015-09-15 Standard    Video              6 2015-09-13
2.4   2015-09-10 2015-09-15 Standard    Video              6 2015-09-14
2.5   2015-09-10 2015-09-15 Standard    Video              6 2015-09-15
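
The key piece here is sequence(), which for each count n produces 1..n and concatenates the results; subtracting 1 gives the day offset to add to Flight.Start. As a sketch of a slightly more defensive variant (not part of the answer above), the counts can be derived from the dates themselves instead of trusting Days.in.Flight. The name n_days is introduced here for illustration.

```r
mydf <- data.frame(Flight.Start = as.Date(c("2015-09-01", "2015-09-10")),
                   Flight.End = as.Date(c("2015-09-03", "2015-09-15")),
                   Device = "Standard",
                   Creative = "Video",
                   Days.in.Flight = c(3, 6),
                   stringsAsFactors = FALSE)

sequence(c(3, 6))
# [1] 1 2 3 1 2 3 4 5 6

# Inclusive number of days per flight, computed from the dates
n_days <- as.integer(mydf$Flight.End - mydf$Flight.Start) + 1L

# Repeat each row n_days times, then add the per-row day offsets
expanded <- mydf[rep(seq_len(nrow(mydf)), n_days), ]
expanded$Date <- expanded$Flight.Start + (sequence(n_days) - 1L)
rownames(expanded) <- NULL  # drop the 1, 1.1, 1.2, ... row names

# Sanity check: no generated date runs past the end of its flight
stopifnot(all(expanded$Date <= expanded$Flight.End))
```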