Backfill previous dates in R

Time: 2018-09-05 18:30:49

Tags: r

Let's look at a simple data frame:

structure(list(a = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("a", "b"), class = "factor"),
               dt = structure(c(NA, 17287, 17318, NA, 17379, 17410), class = "Date")),
          .Names = c("a", "dt"),
          row.names = c(NA, -6L),
          class = "data.frame")

This gives the following:

  a         dt
1 a       <NA>
2 a 2017-05-01
3 a 2017-06-01
4 b       <NA>
5 b 2017-08-01
6 b 2017-09-01

In my actual data this happens many times. How can I backfill each missing date with the start of the previous month?

Ideally I would like to do this with dplyr. The closest I have come uses lubridate::floor_date and dplyr::lead, but that also alters the last date.

Any help would be appreciated.

2 answers:

Answer 0: (score: 0)

You were actually really close to the answer. In addition to lubridate, you also need the dplyr package.

tmp <- structure(list(a = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("a", "b"), class = "factor"), 
                      dt = structure(c(NA, 17287, 17318, NA, 17379, 17410), class = "Date")),
                 .Names = c("a", "dt"), 
                 row.names = c(NA, -6L), 
                 class = "data.frame")

library(lubridate)
library(dplyr)

# Within each group, replace a missing dt with the following date minus one month
tmp <- tmp %>%
  group_by(a) %>%
  mutate(newDT = if_else(is.na(dt), lead(dt) %m-% months(1), dt))
tmp

# A tibble: 6 x 3
# Groups:   a [2]
  a     dt         newDT     
  <fct> <date>     <date>    
1 a     NA         2017-04-01
2 a     2017-05-01 2017-05-01
3 a     2017-06-01 2017-06-01
4 b     NA         2017-07-01
5 b     2017-08-01 2017-08-01
6 b     2017-09-01 2017-09-01

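I'm not good with Excel-style dates in R, but I think that once you get this far you can convert newDT to whatever format you need. (Edit: thanks to @phiver for correcting my code!)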
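As a rough illustration of that last conversion step, here is a minimal base R sketch. The format strings are only example choices, and reading "Excel-style" as the serial-number system with origin 1899-12-30 is an assumption on my part, not something stated in the question.

```r
# Base R only; no packages needed.
# A Date is stored as days since 1970-01-01, so 17287 is the 2017-05-01 from the data.
d <- as.Date(17287, origin = "1970-01-01")

iso_text  <- format(d, "%Y-%m-%d")   # "2017-05-01"
dmy_text  <- format(d, "%d/%m/%Y")   # "01/05/2017"
day_count <- as.numeric(d)           # 17287

# Assumption: "Excel-style" means Excel's serial dates, counted from 1899-12-30.
excel_serial <- as.numeric(d - as.Date("1899-12-30"))   # 42856
```

The same format() calls can be applied to the whole newDT column, since format() is vectorised over Date vectors.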

Answer 1: (score: 0)

I think the currently accepted solution will not work when a group has more than one consecutive NA in dt.
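A minimal sketch of that failure case, assuming dplyr and lubridate are installed. dat2 and res are hypothetical names used only for this illustration; the mutate() call is the one from the other answer.

```r
library(dplyr)
library(lubridate)

# Hypothetical example: one group with two consecutive missing dates
dat2 <- data.frame(
  a  = c("a", "a", "a", "a"),
  dt = as.Date(c(NA, NA, "2017-05-01", "2017-06-01"))
)

res <- dat2 %>%
  group_by(a) %>%
  mutate(newDT = if_else(is.na(dt), lead(dt) %m-% months(1), dt)) %>%
  ungroup()

# Row 1 looks ahead to row 2, which is itself NA, so it stays NA;
# only row 2 (directly before a known date) gets filled.
res$newDT
# NA "2017-04-01" "2017-05-01" "2017-06-01"
```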

Here is an alternative. Note that the row order matters:

Solution

dat

  a         dt
1 a       <NA>
2 a       <NA>
3 a 2017-05-01
4 a 2017-06-01
5 b       <NA>
6 b 2017-08-01
7 b 2017-09-01

library(dplyr)
library(tidyr)

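dat %>%
  group_by(a) %>%
  # Number the known dates within each group; rows with a missing dt get NA for now
  mutate(helper = ifelse(is.na(dt), NA, cumsum(!is.na(dt)))) %>%
  # Attach each run of missing dates to the next known date below it
  fill(helper, .direction = 'up') %>%
  group_by(a, helper) %>%
  # Count back one month per row from the known date that closes the run
  mutate(dt = coalesce(dt,
                       max(dt, na.rm = TRUE) - months(max(row_number()) - row_number()))) %>%
  # Ungroup first; otherwise select() keeps helper because it is a grouping variable
  ungroup() %>%
  dplyr::select(-helper)

# A tibble: 7 x 2
  a     dt        
  <fct> <date>    
1 a     2017-03-01
2 a     2017-04-01
3 a     2017-05-01
4 a     2017-06-01
5 b     2017-07-01
6 b     2017-08-01
7 b     2017-09-01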

Data

dat <-structure(list(a = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L), .Label = c("a", 
"b"), class = "factor"), dt = structure(c(NA, NA, 17287, 17318, 
NA, 17379, 17410), class = "Date")), .Names = c("a", "dt"), row.names = c(NA, 
-7L), class = "data.frame")
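
The helper column is what ties each run of missing dates to the first known date after it, which is why the rows need to be in chronological order within each group. A minimal sketch of just that step, assuming dplyr and tidyr are installed. The data frame is rebuilt with data.frame() for convenience, and steps is a name chosen here for illustration.

```r
library(dplyr)
library(tidyr)

# Same values as dat above, rebuilt inline so the snippet runs on its own
dat <- data.frame(
  a  = factor(c("a", "a", "a", "a", "b", "b", "b")),
  dt = as.Date(c(NA, NA, "2017-05-01", "2017-06-01",
                 NA, "2017-08-01", "2017-09-01"))
)

steps <- dat %>%
  group_by(a) %>%
  # Known dates are numbered 1, 2, ... per group; missing ones are NA
  mutate(helper = ifelse(is.na(dt), NA, cumsum(!is.na(dt)))) %>%
  # Filling upwards gives each missing row the number of the next known date
  fill(helper, .direction = "up") %>%
  ungroup()

steps$helper
# 1 1 1 2 1 1 2
```

Rows that share a helper value within a group form one run that ends in a known date, so the month offsets can be counted back from that last row.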