Fill in missing months in a time series, by item

Time: 2021-03-03 06:21:41

Tags: r dplyr time-series tidyverse

I am fairly new to R and I am trying to do the following:

I have this dataset:

df1 <- data.frame(ITEM = c("A","A","A","A","A","B","B","B","B","B"),
              Date = c("Jan-2020","Feb-2020","May-2020","Jun-2020","Jul-2020","Jan-2020","Apr-2020","Jun-2020","Jul-2020","Aug-2020"))

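I used the zoo library to convert the Date column to yearmon, and I am trying to create rows for the yearmon dates that are missing within each ITEM (for example, Mar-2020 and Apr-2020 for item A).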

Does anyone know how I can do this?

Thanks

1 answer:

Answer 0: (score: 2)

You can create a sequence of yearmon objects for each ITEM and use it in complete.

library(dplyr)
library(zoo)
library(tidyr)

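df1 %>%
  # Parse strings such as "Jan-2020" into zoo's yearmon class
  mutate(Date = as.yearmon(Date, '%b-%Y')) %>%
  group_by(ITEM) %>%
  # Per ITEM, step from its first to its last month in increments of 1/12 year
  complete(Date = seq(min(Date), max(Date), 1/12)) %>%
  ungroup()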

#   ITEM  Date     
#   <chr> <yearmon>
# 1 A     Jan 2020 
# 2 A     Feb 2020 
# 3 A     Mar 2020 
# 4 A     Apr 2020 
# 5 A     May 2020 
# 6 A     Jun 2020 
# 7 A     Jul 2020 
# 8 B     Jan 2020 
# 9 B     Feb 2020 
#10 B     Mar 2020 
#11 B     Apr 2020 
#12 B     May 2020 
#13 B     Jun 2020 
#14 B     Jul 2020 
#15 B     Aug 2020 
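
If the real data has other columns besides ITEM and Date, complete() leaves them as NA in the rows it creates. Below is a minimal sketch of how the fill argument of complete() could set them to a default instead. The QTY column, the df2 data and the choice of 0 as the default are assumptions for illustration; none of them appear in the question.

```r
library(dplyr)
library(zoo)
library(tidyr)

# Hypothetical data: same shape as df1, plus an illustrative QTY column
df2 <- data.frame(
  ITEM = c("A", "A", "B", "B"),
  Date = c("Jan-2020", "Apr-2020", "Feb-2020", "Mar-2020"),
  QTY  = c(10, 20, 5, 7)
)

res <- df2 %>%
  mutate(Date = as.yearmon(Date, "%b-%Y")) %>%
  group_by(ITEM) %>%
  # fill = list(QTY = 0) replaces the NA that the new rows would otherwise get
  complete(Date = seq(min(Date), max(Date), 1/12), fill = list(QTY = 0)) %>%
  ungroup()

res
```

With this toy data, item A should gain rows for Feb 2020 and Mar 2020 with QTY set to 0, while item B has no gaps and is left unchanged. Note that '%b' matches abbreviated month names in the current locale, so the parsing assumes an English locale.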

If you want a sequence of Date objects instead, you can use:

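df1 %>%
  # as.Date() on a yearmon gives the first day of that month
  mutate(Date = as.Date(as.yearmon(Date, '%b-%Y'))) %>%
  group_by(ITEM) %>%
  # by = 'month' steps through calendar months
  complete(Date = seq(min(Date), max(Date), 'month')) %>%
  ungroup()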
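
For comparison, the same per-item sequence can be sketched in base R, without dplyr, tidyr or zoo. This is not part of the original answer. The name filled and the trick of prefixing "01-" so the string parses as the first day of the month are illustrative choices, and '%b' again assumes an English locale.

```r
# Same data as in the question
df1 <- data.frame(
  ITEM = c("A","A","A","A","A","B","B","B","B","B"),
  Date = c("Jan-2020","Feb-2020","May-2020","Jun-2020","Jul-2020",
           "Jan-2020","Apr-2020","Jun-2020","Jul-2020","Aug-2020")
)

# Parse "Jan-2020" as 2020-01-01 by prepending a day of month
df1$Date <- as.Date(paste0("01-", df1$Date), format = "%d-%b-%Y")

# For each ITEM, build the full monthly sequence between its first and last date
filled <- do.call(rbind, lapply(split(df1, df1$ITEM), function(d) {
  data.frame(ITEM = d$ITEM[1],
             Date = seq(min(d$Date), max(d$Date), by = "month"))
}))
rownames(filled) <- NULL

filled
```

This sketch only rebuilds the ITEM and Date columns. Any other columns would have to be merged back in, for example with merge(filled, df1, all.x = TRUE), which is the part that complete() handles automatically.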