Solution for repeating a value across a given date range

Time: 2019-09-25 07:54:35

Tags: r date dataframe

Error in seq.Date(as.Date(retail$Valid_from), as.Date(retail$Valid_to),  : 
  'from' must be of length 1

I tried both methods mentioned in the question:

I basically want to repeat the value for every day in the given date range:

HSD_RSP            Valid_from   Valid_to
70                 1/1/2018     1/15/2018
80                 1/16/2018    1/31/2018
.
.
.

Method 1:

byDay = ddply(retail, .(HSD_RSP), transform, 
              day=seq(as.Date(retail$Valid_from), as.Date(retail$Valid_to), by="day"))

Method 2:

dt <- data.table(retail)
dt <- dt[,seq(as.Date(Valid_from),as.Date(Valid_to),by="day"),
         by=list(HSD_RSP)]

Desired output:

HSD_RSP      final_date
70             1/1/2018
70           2/1/2018
70           3/1/2018
70           4/1/2018
.
.
.

Output of dput(head(retail)):

structure(list(HSD_RSP = c(61.68, 62.96, 63.14, 60.51, 60.34, 
61.63), Valid_from = structure(c(1483315200, 1484524800, 1487116800, 
1491004800, 1491523200, 1492300800), class = c("POSIXct", "POSIXt"
), tzone = "UTC"), Valid_to = structure(c(1484438400, 1487030400, 
1490918400, 1491436800, 1492214400, 1493510400), class = c("POSIXct", 
"POSIXt"), tzone = "UTC")), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame"))

2 answers:

Answer 0 (score: 2)

Convert the columns to dates, create a sequence of dates between Valid_from and Valid_to for each row, and unnest the result.

library(tidyverse)

df %>%
  mutate_at(vars(starts_with("Valid")), as.Date, "%m/%d/%Y") %>%
  mutate(Date = map2(Valid_from, Valid_to, seq, by = "1 day")) %>%
  unnest(Date) %>%
  select(-Valid_from, -Valid_to)

#  HSD_RSP   Date      
#     <int> <date>    
# 1      70 2018-01-01
# 2      70 2018-01-02
# 3      70 2018-01-03
# 4      70 2018-01-04
# 5      70 2018-01-05
# 6      70 2018-01-06
# 7      70 2018-01-07
# 8      70 2018-01-08
# 9      70 2018-01-09
#10      70 2018-01-10
# … with 21 more rows

Data

df <- structure(list(HSD_RSP = c(70L, 80L), Valid_from = structure(1:2, 
.Label = c("1/1/2018", "1/16/2018"), class = "factor"), Valid_to = 
structure(1:2, .Label = c("1/15/2018", "1/31/2018"), class = "factor")),
class = "data.frame", row.names = c(NA, -2L))
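
The error in the question arises because seq() expects a single start and a single end date, while retail$Valid_from passes the whole column. The dput output also shows that the columns in retail are already POSIXct, so no format string is needed for that object. A minimal base R sketch of the same per-row idea, rebuilding only the first two rows of that dput output (the names from, to, days, and out are illustrative, not from the original posts):

```r
# First two rows of the dput(head(retail)) output from the question
retail <- data.frame(
  HSD_RSP = c(61.68, 62.96),
  Valid_from = as.POSIXct(c(1483315200, 1484524800), origin = "1970-01-01", tz = "UTC"),
  Valid_to = as.POSIXct(c(1484438400, 1487030400), origin = "1970-01-01", tz = "UTC")
)

# POSIXct converts to Date directly; no format string is needed
from <- as.Date(retail$Valid_from)
to <- as.Date(retail$Valid_to)

# seq() takes a single start and end, so build one sequence per row
days <- Map(seq, from, to, by = "day")

# Repeat each HSD_RSP once per generated day and flatten the sequences
out <- data.frame(
  HSD_RSP = rep(retail$HSD_RSP, lengths(days)),
  final_date = do.call(c, days)
)
```

Here out has one row per day, with HSD_RSP repeated for every day in its range.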

Answer 1 (score: 1)

Using data.table, with Ronak Shah's data structure:

library(data.table)
dt <- as.data.table(df)  # df is defined in the data block below
dt[, .(final_date = seq(as.Date(Valid_from, "%m/%d/%Y"), as.Date(Valid_to, "%m/%d/%Y"), by = "day")),
   by = HSD_RSP]

    HSD_RSP final_date
 1:      70 2018-01-01
 2:      70 2018-01-02
 3:      70 2018-01-03
 4:      70 2018-01-04
 ....

Data:

df <- structure(list(HSD_RSP = c(70L, 80L), Valid_from = structure(1:2, 
.Label = c("1/1/2018", "1/16/2018"), class = "factor"), Valid_to = 
structure(1:2, .Label = c("1/15/2018", "1/31/2018"), class = "factor")),
class = "data.frame", row.names = c(NA, -2L))
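
Grouping by HSD_RSP works for this sample because every value appears in exactly one row. If the same value covered two separate ranges, seq() would receive two start dates and fail with the 'from' must be of length 1 error again. A minimal sketch that groups by row number instead, assuming data.table is installed; df2, dt2, res, and row_id are hypothetical names, and the third row is made-up illustration data:

```r
library(data.table)

# Hypothetical data: HSD_RSP 70 appears in two separate date ranges
df2 <- data.frame(
  HSD_RSP = c(70L, 80L, 70L),
  Valid_from = c("1/1/2018", "1/16/2018", "2/1/2018"),
  Valid_to = c("1/15/2018", "1/31/2018", "2/3/2018"),
  stringsAsFactors = FALSE
)

dt2 <- as.data.table(df2)

# Group by row number so seq() always receives a single start and end date
res <- dt2[, .(HSD_RSP,
               final_date = seq(as.Date(Valid_from, "%m/%d/%Y"),
                                as.Date(Valid_to, "%m/%d/%Y"),
                                by = "day")),
           by = .(row_id = seq_len(nrow(dt2)))]

# Drop the helper grouping column
res[, row_id := NULL]
```

Each row is expanded independently, so repeated HSD_RSP values no longer collide.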