How to add dates in each row based on a condition

Time: 2018-08-13 01:50:43

Tags: r

I have a data frame, before:

before <- data.frame(id = c("a", "a"),
             date = c("2011-12-18","2011-12-24"),
             apple_days = c(3, 2),
             banana_days = c(3, 2),
             mango_days = c(1, 5))

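and I want to change it to look like after below.
Rows should be added according to apple_days, banana_days, and mango_days, and date should increase by 1 with each added row.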

after <- data.frame(id = c("a", "a", "a", "a", "a", "a", "a", "a"),
             date = c("2011-12-18","2011-12-19","2011-12-20","2011-12-24",
                      "2011-12-25","2011-12-26","2011-12-27","2011-12-28"),
             apple_days = c(1,1,1,1,1,"","",""),
             banana_days = c(1,1,1,1,1,"","",""),
             mango_days = c(1,"","",1,1,1,1,1))

3 answers:

Answer 0: (score: 1)

A base R attempt:

vars <- c("apple_days", "banana_days", "mango_days")
cnt <- do.call(pmax, before[vars])
cntseq <- sequence(cnt)
after <- before[rep(seq_len(nrow(before)), cnt), ]
after[vars] <- lapply(after[vars], function(x) as.numeric(x >= cntseq) )
after$date <- as.Date(after$date) + cntseq - 1

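Set the variables of interest, find the maximum in each row, then repeat each row that many times. Check whether each original value is greater than or equal to the sequence number of the new row (1 if so, 0 otherwise). Add the row sequence, minus 1, to the original date.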

#    id       date apple_days banana_days mango_days
#1    a 2011-12-18          1           1          1
#1.1  a 2011-12-19          1           1          0
#1.2  a 2011-12-20          1           1          0
#2    a 2011-12-24          1           1          1
#2.1  a 2011-12-25          1           1          1
#2.2  a 2011-12-26          0           0          1
#2.3  a 2011-12-27          0           0          1
#2.4  a 2011-12-28          0           0          1
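
To make the steps described above concrete, here is a small sketch that prints the intermediate vectors for the before data from the question. The names idx and mango_flag are introduced here for illustration only and are not part of the answer's code.

```r
before <- data.frame(id = c("a", "a"),
                     date = c("2011-12-18", "2011-12-24"),
                     apple_days = c(3, 2),
                     banana_days = c(3, 2),
                     mango_days = c(1, 5))

vars <- c("apple_days", "banana_days", "mango_days")

# Rows needed per original row: the largest of the three counts
cnt <- do.call(pmax, before[vars])
print(cnt)      # 3 5

# Position of each new row within its original row
cntseq <- sequence(cnt)
print(cntseq)   # 1 2 3 1 2 3 4 5

# Row indices used to repeat the original rows (illustrative name)
idx <- rep(seq_len(nrow(before)), cnt)
print(idx)      # 1 1 1 2 2 2 2 2

# A count of n gives 1 for positions 1..n and 0 afterwards
mango_flag <- as.numeric(before$mango_days[idx] >= cntseq)
print(mango_flag)   # 1 0 0 1 1 1 1 1
```

The comparison x >= cntseq is what turns each count into a run of 1s followed by 0s.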

Answer 1: (score: 1)

Here is a tidyverse / lubridate option:

library(lubridate)
library(tidyverse)
before %>%
     mutate(ndays = do.call(pmax, .[, -(1:2)])) %>%
     rowwise() %>%
     mutate(tmp = list(ymd(date) + days(0:(ndays - 1)))) %>%
     unnest() %>%
     group_by(date) %>%
     mutate_at(vars(contains("_days")), function(x)
         replace(rep(0, length(x)), 1:unique(x), 1)) %>%
     ungroup() %>%
     select(-ndays, -date) %>%
     rename(date = tmp)
# A tibble: 8 x 5
#  id    apple_days banana_days mango_days date
#  <fct>      <dbl>       <dbl>      <dbl> <date>
#1 a             1.          1.         1. 2011-12-18
#2 a             1.          1.         0. 2011-12-19
#3 a             1.          1.         0. 2011-12-20
#4 a             1.          1.         1. 2011-12-24
#5 a             1.          1.         1. 2011-12-25
#6 a             0.          0.         1. 2011-12-26
#7 a             0.          0.         1. 2011-12-27
#8 a             0.          0.         1. 2011-12-28

This seems rather long, and I would be interested to see shorter dplyr / tidyverse options.


A slight variation using the rlang::!!! syntax:

before %>%
     rowwise() %>%
     mutate(
         ndays = max(!!!syms(grep("_days", names(.), value = TRUE))),
         tmp = list(ymd(date) + days(0:(ndays - 1)))) %>%
     unnest() %>%
     group_by(date) %>%
     mutate_at(vars(contains("_days")), function(x)
         replace(rep(0, length(x)), 1:unique(x), 1)) %>%
     ungroup() %>%
     select(-ndays, -date) %>%
     rename(date = tmp)
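
On the question of a shorter version, one possible sketch uses tidyr::uncount() to repeat the rows. This is not from the answer above. It assumes tidyr >= 0.8.0 (for uncount() and its .id argument) and dplyr >= 1.0.0 (for across()), both of which postdate or barely predate this thread. The names day and after_short are illustrative.

```r
library(dplyr)
library(tidyr)

before <- data.frame(id = c("a", "a"),
                     date = c("2011-12-18", "2011-12-24"),
                     apple_days = c(3, 2),
                     banana_days = c(3, 2),
                     mango_days = c(1, 5))

after_short <- before %>%
  # Row-wise maximum of the three counts
  mutate(ndays = pmax(apple_days, banana_days, mango_days)) %>%
  # Repeat each row ndays times; "day" numbers the copies 1, 2, ...
  uncount(ndays, .id = "day") %>%
  mutate(date = as.Date(date) + day - 1,
         # 1 while the copy number is within the original count, else 0
         across(ends_with("_days"), ~ as.numeric(.x >= day))) %>%
  select(-day)

print(after_short)
```

Under those assumptions the result should have the same eight rows as the output above, with the columns in their original order.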

Answer 2: (score: 1)

Here is an option using tidyverse:

library(tidyverse)
before %>%
  mutate(ndays = pmax(!!! rlang::syms(names(.)[-(1:2)])), 
  date = map2(as.Date(date), ndays,  ~ seq(.x, .x + .y - 1, by = 'day')))  %>% 
  rowwise()  %>%
  mutate_at(vars(matches("_days")),
           funs(list(+!is.na(`length<-`(seq(.), ndays))))) %>% 
  unnest 
# A tibble: 8 x 5
#  id    date       apple_days banana_days mango_days
#  <fct> <date>          <int>       <int>      <int>
#1 a     2011-12-18          1           1          1
#2 a     2011-12-19          1           1          0
#3 a     2011-12-20          1           1          0
#4 a     2011-12-24          1           1          1
#5 a     2011-12-25          1           1          1
#6 a     2011-12-26          0           0          1
#7 a     2011-12-27          0           0          1
#8 a     2011-12-28          0           0          1
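
The expression inside funs() is dense, so here is a sketch of what it evaluates to for a single cell. The values n = 2 and ndays = 5 are chosen to mirror apple_days in the second row of before; the variable names are illustrative.

```r
n <- 2       # per-row count, e.g. apple_days in the second row
ndays <- 5   # row-wise maximum for that row

# seq(n) gives 1, 2; `length<-` pads it with NA up to length ndays
padded <- `length<-`(seq(n), ndays)
print(padded)   # 1 2 NA NA NA

# Non-NA positions become TRUE; unary + converts the logicals to integers
flags <- +!is.na(padded)
print(flags)    # 1 1 0 0 0
```

Because unary + on a logical vector returns integers, the resulting columns show as <int> in the tibble above.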