Split rows with yearly values into rows with monthly values

Asked: 2018-06-01 11:38:27

Tags: r

I have a table of yearly totals. I want to split each row into monthly values by dividing "Total" by 12.

library(readr)
myData = read_delim("Date,b,c,d,Total\n2018,NA,NA,NA,12\n2018,0.5,0.5,NA,24\n2018,0.3,NA,0.5,36\n", delim=",")
myData 
# A tibble: 3 x 5
   Date     b     c     d Total
  <int> <dbl> <dbl> <dbl> <int>
1  2018  NA    NA    NA      12
2  2018   0.5   0.5  NA      24
3  2018   0.3  NA     0.5    36

Desired output (shown for the first row only; I expect 36 rows in total):

   Date       b     c     d     Total
 1 2018-01-01 NA    NA    NA        1
 2 2018-02-01 NA    NA    NA        1
 3 2018-03-01 NA    NA    NA        1
 4 2018-04-01 NA    NA    NA        1
 5 2018-05-01 NA    NA    NA        1
 6 2018-06-01 NA    NA    NA        1
 7 2018-07-01 NA    NA    NA        1
 8 2018-08-01 NA    NA    NA        1
 9 2018-09-01 NA    NA    NA        1
10 2018-10-01 NA    NA    NA        1
11 2018-11-01 NA    NA    NA        1
12 2018-12-01 NA    NA    NA        1

I have checked the accepted answer here: Break summed row into individual rows in R, but unfortunately it does not work for me.

1 answer:

Answer 0: (score: 0)

If you really need monthly dates, you can do this with complete.

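I created a unique ID for each yearly row based on Total (you may need to do this differently depending on how your data is actually organized; this only works if every Total value is distinct). Then I converted your Date column to a date on the first day of that year. Then I used complete to fill in the remaining months of the year. fill carries the values of b, c, d and Total down into the new rows, and mutate divides the total by 12.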

library(dplyr)
library(tidyr)
library(readr)
library(lubridate)

myData = read_delim("Date,b,c,d,Total\n2018,NA,NA,NA,12\n2018,0.5,0.5,NA,24\n2018,0.3,NA,0.5,36\n", delim=",")
myData 
#  # A tibble: 3 x 5
#     Date      b      c      d Total
#    <int>  <dbl>  <dbl>  <dbl> <int>
#  1  2018 NA     NA     NA        12
#  2  2018  0.500  0.500 NA        24
#  3  2018  0.300 NA      0.500    36

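myData %>%
  # one ID per original row, derived from Total (assumes Total values are unique)
  mutate(group_id = group_indices(., Total)) %>% 
  # turn the year into a date: 1 January of that year
  mutate(Date = dmy(paste0("01/01/", Date))) %>% 
  group_by(group_id) %>% 
  # add the remaining 11 months for each group
  complete(Date = seq.Date(Date[1], by = "month", length.out = 12)) %>% 
  # carry b, c, d and Total down into the newly created rows
  fill(b, c, d, Total) %>% 
  # split the annual total evenly across the 12 months
  mutate(Total = Total / 12) %>%
  ungroup() %>% 
  select(-group_id)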

#  # A tibble: 36 x 5
#     Date           b     c     d Total
#     <date>     <dbl> <dbl> <dbl> <dbl>
#   1 2018-01-01    NA    NA    NA  1.00
#   2 2018-02-01    NA    NA    NA  1.00
#   3 2018-03-01    NA    NA    NA  1.00
#   4 2018-04-01    NA    NA    NA  1.00
#   5 2018-05-01    NA    NA    NA  1.00
#   6 2018-06-01    NA    NA    NA  1.00
#   7 2018-07-01    NA    NA    NA  1.00
#   8 2018-08-01    NA    NA    NA  1.00
#   9 2018-09-01    NA    NA    NA  1.00
#  10 2018-10-01    NA    NA    NA  1.00
#  # ... with 26 more rows
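
If two yearly rows can share the same Total, an ID built from Total will merge them into one group. As a possible alternative, here is a minimal base R sketch that repeats each row by its position instead. It rebuilds the sample data as a plain data.frame rather than a tibble, and the helper names idx, month_no and monthly are only illustrative. It assumes every row covers a full calendar year and that the total is spread evenly across months.

```r
# Sample data from the question, rebuilt as a plain data.frame (assumption:
# column types match what read_delim would produce, apart from Total being double).
myData <- data.frame(
  Date  = c(2018L, 2018L, 2018L),
  b     = c(NA, 0.5, 0.3),
  c     = c(NA, 0.5, NA),
  d     = c(NA, NA, 0.5),
  Total = c(12, 24, 36)
)

# Repeat every row index 12 times (1, 1, ..., 2, 2, ...), so rows stay
# separate even when two of them have the same Total.
idx <- rep(seq_len(nrow(myData)), each = 12)
monthly <- myData[idx, ]

# Month counter 1..12, restarted for every original row.
month_no <- rep(1:12, times = nrow(myData))

# Build the first day of each month from the year and the month counter.
monthly$Date <- as.Date(sprintf("%d-%02d-01", monthly$Date, month_no))

# Spread the annual total evenly across the 12 months.
monthly$Total <- monthly$Total / 12
rownames(monthly) <- NULL

head(monthly, 12)
```

Because the grouping relies on row position rather than on a data column, this version needs no ID column and no fill step; the values of b, c and d are copied along when the rows are repeated.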