Add rows to a data frame based on the dates in the previous row

Time: 2014-11-10 06:45:10

Tags: r

I have a data frame precip_range

start_date<-as.Date(c("2010-4-01", "2010-4-02", "2010-04-04", "2010-07-02", "2010-07-02", "2010-07-03"))  
end_date<-as.Date(c("2010-7-01", "2010-07-01", "2010-07-02", "2010-10-03", "2010-10-04", "2010-10-03"))
date_category<-(c("A", "A", "A", "B", "B", "B"))
site <-c("Site 1", "Site 2", "Site 3", "Site 1", "Site 2", "Site 3")
precip_range<-data.frame(site, start_date, end_date, date_category)
precip_range$days <-(end_date-start_date)

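I would like to add a column Date, and add rows whose Date values fill the gap between start_date and end_date for each site. All columns other than Date should keep the same information as in precip_range. I would like the first few rows of the resulting data frame to look similar to the data frame result_example: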

date<-as.Date(c("2010-04-01", "2010-04-02", "2010-04-03", "2010-04-04", "2010-04-05", "2010-04-06"))
result_date_category <-c("A", "A", "A", "A", "A", "A")
result_site <-c("Site 1", "Site 1", "Site 1", "Site 1", "Site 1", "Site 1")
result_start_date <-as.Date(c("2010-04-01", "2010-04-01", "2010-04-01", "2010-04-01", "2010-04-01","2010-04-01"))
result_end_date <-as.Date(c("2010-07-01", "2010-07-01", "2010-07-01", "2010-07-01", "2010-07-01","2010-07-01"))
result_example <-data.frame(date, result_site, result_start_date, result_end_date, result_date_category)
result_example$days <-(result_end_date-result_start_date)

My question is similar to "In R: Add rows with data of previous row to data frame", but I have not been able to adapt the answers there to my situation. Thank you.

1 answer:

Answer 0: (score: 3)

Try the following

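# Number of days each row spans, counting both endpoints
diffs <- with(precip_range, end_date - start_date + 1)
# Repeat each row once per day it spans
result_site <- precip_range[rep(seq_len(nrow(precip_range)), diffs), ]

library(data.table)
# Within each site/category group, fill Date with the daily sequence
setDT(result_site)[, Date := seq.int(start_date[1], end_date[1], by = "day"), 
                     by = list(site, date_category)]
result_site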

#        site start_date   end_date date_category    days       Date
#   1: Site 1 2010-04-01 2010-07-01             A 91 days 2010-04-01
#   2: Site 1 2010-04-01 2010-07-01             A 91 days 2010-04-02
#   3: Site 1 2010-04-01 2010-07-01             A 91 days 2010-04-03
#   4: Site 1 2010-04-01 2010-07-01             A 91 days 2010-04-04
#   5: Site 1 2010-04-01 2010-07-01             A 91 days 2010-04-05
# ---                                                              
# 551: Site 3 2010-07-03 2010-10-03             B 92 days 2010-09-29
# 552: Site 3 2010-07-03 2010-10-03             B 92 days 2010-09-30
# 553: Site 3 2010-07-03 2010-10-03             B 92 days 2010-10-01
# 554: Site 3 2010-07-03 2010-10-03             B 92 days 2010-10-02
# 555: Site 3 2010-07-03 2010-10-03             B 92 days 2010-10-03

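Here we computed the number of days each row spans (the date difference plus one, so that both endpoints are included) and repeated each row that many times via row indexing. We then used the data.table package to fill in the dates between start_date and end_date within each group (for better performance).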
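
The same two steps (repeat each row, then fill in the dates) can also be sketched in base R without data.table. This is only an illustration, not part of the original answer: the names n_days and expanded are made up here, and the question's example data is rebuilt so the snippet runs on its own.

```r
# Rebuild the example data from the question.
precip_range <- data.frame(
  site = c("Site 1", "Site 2", "Site 3", "Site 1", "Site 2", "Site 3"),
  start_date = as.Date(c("2010-04-01", "2010-04-02", "2010-04-04",
                         "2010-07-02", "2010-07-02", "2010-07-03")),
  end_date = as.Date(c("2010-07-01", "2010-07-01", "2010-07-02",
                       "2010-10-03", "2010-10-04", "2010-10-03")),
  date_category = c("A", "A", "A", "B", "B", "B")
)
precip_range$days <- precip_range$end_date - precip_range$start_date

# Number of calendar days each row covers, endpoints included
# (n_days is a hypothetical helper name).
n_days <- as.integer(precip_range$end_date - precip_range$start_date) + 1L

# Repeat each original row once per covered day.
expanded <- precip_range[rep(seq_len(nrow(precip_range)), times = n_days), ]

# sequence(n_days) counts 1..n within each original row, so subtracting 1
# gives the day offset from start_date: 0, 1, ..., n - 1.
expanded$Date <- expanded$start_date + (sequence(n_days) - 1L)

# Replace the duplicated row names ("1", "1.1", ...) with 1..N.
rownames(expanded) <- NULL

head(expanded)
```

Because the dates are derived from a per-row offset rather than from a grouping, this variant should not depend on each site/date_category combination appearing only once in precip_range.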