How do I add missing months to a data frame?

Time: 2018-10-03 19:40:50

Tags: r

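I have a dataset with observations for only three months: January, February, and March. I want to add the remaining months to the same data frame as zero-valued observations, but I'm having trouble appending them.

Here is my current code: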

library(dplyr)

Period <- c("January 2015", "February 2015", "March 2015",
            "January 2016", "February 2016", "March 2016",
            "January 2017", "February 2017", "March 2017",
            "January 2018", "February 2018", "March 2018")

Month <- c("January", "February", "March",
           "January", "February", "March",
           "January", "February", "March",
           "January", "February", "March")

Dollars <- c(936, 753, 731, 
             667, 643, 588, 
             948, 894, 997, 
             774,745, 684)

dat <- data.frame(Period = Period, Month = Month, Dollars = Dollars)

dat2 <- dat %>%
  dplyr::select(Month, Dollars) %>%
  dplyr::group_by(Month) %>%
  dplyr::summarise(AvgDollars = mean(Dollars))

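Any ideas for filling in April through December in the dataset would be appreciated. Thanks in advance!

3 answers:

Answer 0 (score: 2)

Here is one way to do this using complete():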

library(tidyverse)

Then use complete():

dat2 <- data.frame(Period = Period, Month = Month, Dollars = Dollars) %>% 
  # make a "year" variable
  mutate(Year = word(Period, 2,2)) %>% 
  # remove period variable (we'll add it in later)
  select(-Period) %>% 
  # month.name is a base variable listing all months (thanks @Gregor).
  # nesting by "Year" lets complete know you only want the years listed in your dataset.
  complete(Month = month.name, nesting(Year), fill = list(Dollars = 0)) %>% 
  # Arrange by Year and month
  arrange(Year, Month) %>% 
  #remake the "period" variable 
  mutate(Period = paste(Month, Year)) %>% 
  group_by(Month) %>% 
  summarise(AvgDollars = mean(Dollars))
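
To see what complete() does on its own, here is a minimal sketch on a small toy data frame (the names `toy` and `filled` are illustrative and not part of the answer). It also converts Month to a factor with levels = month.name so that rows sort in calendar order rather than alphabetically; that step is an addition, not something the answer does.

```r
library(dplyr)
library(tidyr)

# Toy data (hypothetical): two years, only January and February observed
toy <- data.frame(
  Year    = c("2015", "2015", "2016", "2016"),
  Month   = c("January", "February", "January", "February"),
  Dollars = c(936, 753, 667, 643),
  stringsAsFactors = FALSE
)

filled <- toy %>%
  # add every missing Year x Month combination, with Dollars set to 0
  complete(Year, Month = month.name, fill = list(Dollars = 0)) %>%
  # a factor with month.name levels makes arrange() sort in calendar order
  mutate(Month = factor(Month, levels = month.name)) %>%
  arrange(Year, Month)
```

Each year present in `toy` ends up with twelve rows, and the rows that complete() created carry Dollars = 0.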

Answer 1 (score: 1)

There may be a more elegant solution with dplyr, but here is a quick one that doesn't take much typing:

dat <- rbind(data.frame(Period = Period, Month = Month, Dollars = Dollars),
             data.frame(Period = c(sapply(2015:2018, function(x) format(ISOdate(x,4:12,1),"%B %Y"))),
                        Month = c(sapply(2015:2018, function(x) format(ISOdate(x,4:12,1),"%B"))),
                        Dollars = 0))
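
A base R alternative that avoids locale-dependent date formatting, offered as a sketch rather than as this answer's method: build the full Year x Month grid with expand.grid() and merge() the observed rows onto it. The names `obs`, `grid`, and `full` are illustrative, and `obs` is only a small stand-in for the question's data.

```r
# Observed data (a hypothetical subset of the question's data)
obs <- data.frame(
  Period  = c("January 2015", "February 2015", "March 2015"),
  Dollars = c(936, 753, 731),
  stringsAsFactors = FALSE
)

# Every month of the years of interest; month.name is built into base R
grid <- expand.grid(Month = month.name, Year = 2015:2016,
                    stringsAsFactors = FALSE)
grid$Period <- paste(grid$Month, grid$Year)

# Keep all grid rows; months without an observation get NA, then 0
full <- merge(grid, obs, by = "Period", all.x = TRUE)
full$Dollars[is.na(full$Dollars)] <- 0

# merge() sorts by the key column, so restore calendar order
full <- full[order(full$Year, match(full$Month, month.name)), ]
```

Because the month names come from month.name instead of format(), the result does not depend on the session's LC_TIME setting.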

Answer 2 (score: 1)

Here is a two-step solution:

library(dplyr)
Sys.setlocale("LC_TIME", "English")  # "English" is the Windows locale name; Linux/macOS typically need "en_US.UTF-8" or "C"
# first, define a data frame with each month from January 2015 to December 2018
dat2 <- data.frame(Period = format(seq(as.Date("2015/1/1"),
                                       as.Date("2018/12/1"), by = "month"),
                                   format = "%B %Y"),
                   stringsAsFactors = FALSE)
# derive Month from dat2's own Period column; inside data.frame() a bare
# `Period` would pick up the 12-element global vector and be recycled
dat2$Month <- substr(dat2$Period, 1, nchar(dat2$Period) - 5)
# then, merge dat and dat2; joining from dat2 keeps its calendar row order
dat2 %>%
  left_join(select(dat, Period, Dollars), by = "Period") %>%
  select(Period, Month, Dollars)
           Period     Month Dollars
1    January 2015   January     936
2   February 2015  February     753
3      March 2015     March     731
4      April 2015     April      NA
5        May 2015       May      NA
6       June 2015      June      NA
7       July 2015      July      NA
8     August 2015    August      NA
9  September 2015 September      NA
10   October 2015   October      NA
11  November 2015  November      NA
12  December 2015  December      NA
13   January 2016   January     667
14  February 2016  February     643
15     March 2016     March     588
16     April 2016     April      NA
17       May 2016       May      NA
18      June 2016      June      NA
19      July 2016      July      NA
20    August 2016    August      NA
21 September 2016 September      NA
22   October 2016   October      NA
23  November 2016  November      NA
24  December 2016  December      NA
25   January 2017   January     948
26  February 2017  February     894
27     March 2017     March     997
28     April 2017     April      NA
29       May 2017       May      NA
30      June 2017      June      NA
31      July 2017      July      NA
32    August 2017    August      NA
33 September 2017 September      NA
34   October 2017   October      NA
35  November 2017  November      NA
36  December 2017  December      NA
37   January 2018   January     774
38  February 2018  February     745
39     March 2018     March     684
40     April 2018     April      NA
41       May 2018       May      NA
42      June 2018      June      NA
43      July 2018      July      NA
44    August 2018    August      NA
45 September 2018 September      NA
46   October 2018   October      NA
47  November 2018  November      NA
48  December 2018  December      NA
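
The join leaves NA in Dollars for months with no observation, whereas the question asked for zeros. One way to convert them, sketched here on a small stand-in data frame (`joined` is a hypothetical name for the join result), is dplyr::coalesce():

```r
library(dplyr)

# Stand-in for the joined result: one observed month, two missing ones
joined <- data.frame(
  Period  = c("March 2015", "April 2015", "May 2015"),
  Dollars = c(731, NA, NA),
  stringsAsFactors = FALSE
)

# coalesce() returns the first non-missing value, so each NA becomes 0
joined <- joined %>%
  mutate(Dollars = coalesce(Dollars, 0))
```

tidyr::replace_na(list(Dollars = 0)) should give the same result if tidyr is already loaded.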