I have a data frame named before:
before <- data.frame(id = c("a", "a"),
                     date = c("2011-12-18", "2011-12-24"),
                     apple_days = c(3, 2),
                     banana_days = c(3, 2),
                     mango_days = c(1, 5))
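I want to transform it so that it looks like after, shown below. Rows should be added according to the values of apple_days, banana_days, and mango_days, and date should increase by 1 for each added row.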
after <- data.frame(id = c("a", "a", "a", "a", "a", "a", "a", "a"),
                    date = c("2011-12-18", "2011-12-19", "2011-12-20", "2011-12-24",
                             "2011-12-25", "2011-12-26", "2011-12-27", "2011-12-28"),
                    apple_days = c(1, 1, 1, 1, 1, "", "", ""),
                    banana_days = c(1, 1, 1, 1, 1, "", "", ""),
                    mango_days = c(1, "", "", 1, 1, 1, 1, 1))
Answer 0 (score: 1)
A base R attempt:
vars <- c("apple_days", "banana_days", "mango_days")
cnt <- do.call(pmax, before[vars])
cntseq <- sequence(cnt)
after <- before[rep(seq_len(nrow(before)), cnt), ]
after[vars] <- lapply(after[vars], function(x) as.numeric(x >= cntseq) )
after$date <- as.Date(after$date) + cntseq - 1
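Set the variables of interest, find the maximum value in each row, and repeat each row that many times. Then check whether each original value is at least as large as the sequence number within its repeated rows: this gives 1 while the count has not run out and 0 afterwards. Finally, add the sequence number, minus 1, to the original date.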
#     id       date apple_days banana_days mango_days
#1     a 2011-12-18          1           1          1
#1.1   a 2011-12-19          1           1          0
#1.2   a 2011-12-20          1           1          0
#2     a 2011-12-24          1           1          1
#2.1   a 2011-12-25          1           1          1
#2.2   a 2011-12-26          0           0          1
#2.3   a 2011-12-27          0           0          1
#2.4   a 2011-12-28          0           0          1
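To make the intermediate steps of the base R approach easier to follow, here is a small sketch that prints what each step produces for the example data. The names idx and mango_flag are only illustrative and are not part of the answer's code.

```r
before <- data.frame(id = c("a", "a"),
                     date = c("2011-12-18", "2011-12-24"),
                     apple_days = c(3, 2),
                     banana_days = c(3, 2),
                     mango_days = c(1, 5))

vars <- c("apple_days", "banana_days", "mango_days")

# Row-wise maximum across the *_days columns: how many rows each input row needs
cnt <- do.call(pmax, before[vars])
print(cnt)      # 3 5

# Within-row counter: 1..3 for the first row, 1..5 for the second
cntseq <- sequence(cnt)
print(cntseq)   # 1 2 3 1 2 3 4 5

# Index used to repeat the original rows (illustrative name)
idx <- rep(seq_len(nrow(before)), cnt)
print(idx)      # 1 1 1 2 2 2 2 2

# A column is flagged 1 while its original count is >= the counter
mango_flag <- as.numeric(before$mango_days[idx] >= cntseq)
print(mango_flag)   # 1 0 0 1 1 1 1 1
```

The first input row has mango_days = 1, so only the first of its three repeated rows is flagged; the second has mango_days = 5, so all five are.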
Answer 1 (score: 1)
Here is a tidyverse / lubridate option:
library(lubridate)
library(tidyverse)
before %>%
    mutate(ndays = do.call(pmax, .[, -(1:2)])) %>%
    rowwise() %>%
    mutate(tmp = list(ymd(date) + days(0:(ndays - 1)))) %>%
    unnest() %>%
    group_by(date) %>%
    mutate_at(vars(contains("_days")), function(x)
        replace(rep(0, length(x)), 1:unique(x), 1)) %>%
    ungroup() %>%
    select(-ndays, -date) %>%
    rename(date = tmp)
## A tibble: 8 x 5
#  id    apple_days banana_days mango_days date
#  <fct>      <dbl>       <dbl>      <dbl> <date>
#1 a             1.          1.         1. 2011-12-18
#2 a             1.          1.         0. 2011-12-19
#3 a             1.          1.         0. 2011-12-20
#4 a             1.          1.         1. 2011-12-24
#5 a             1.          1.         1. 2011-12-25
#6 a             0.          0.         1. 2011-12-26
#7 a             0.          0.         1. 2011-12-27
#8 a             0.          0.         1. 2011-12-28
This seems rather long, and I would be curious to see shorter dplyr / tidyverse options.
A slight variation using the rlang !!! syntax:
before %>%
    rowwise() %>%
    mutate(
        ndays = max(!!!syms(grep("_days", names(.), value = TRUE))),
        tmp = list(ymd(date) + days(0:(ndays - 1)))) %>%
    unnest() %>%
    group_by(date) %>%
    mutate_at(vars(contains("_days")), function(x)
        replace(rep(0, length(x)), 1:unique(x), 1)) %>%
    ungroup() %>%
    select(-ndays, -date) %>%
    rename(date = tmp)
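On the question of a shorter tidyverse option, one possible sketch is below. It is not taken from the answers in this thread: it assumes a reasonably recent tidyr (for uncount() with its .id argument) and dplyr >= 1.0.0 (for across()), and the column name day and the result name res are just illustrative choices.

```r
library(dplyr)
library(tidyr)

before <- data.frame(id = c("a", "a"),
                     date = c("2011-12-18", "2011-12-24"),
                     apple_days = c(3, 2),
                     banana_days = c(3, 2),
                     mango_days = c(1, 5))

res <- before %>%
    # Number of output rows needed for each input row
    mutate(ndays = pmax(apple_days, banana_days, mango_days)) %>%
    # Repeat each row ndays times; `day` counts 1..ndays within each original row
    # (assumption: uncount() drops the ndays column by default)
    uncount(ndays, .id = "day") %>%
    # Flag 1 while the original count covers this day, then shift the date
    mutate(across(ends_with("_days"), ~ as.numeric(.x >= day)),
           date = as.Date(date) + day - 1) %>%
    select(-day)

print(res)
```

The idea is the same as in the base R answer (repeat rows, compare against a within-row counter, offset the date), just expressed with uncount() doing the row repetition.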
Answer 2 (score: 1)
Here is a tidyverse option:
library(tidyverse)
before %>%
    mutate(ndays = pmax(!!! rlang::syms(names(.)[-(1:2)])),
           date = map2(as.Date(date), ndays, ~ seq(.x, .x + .y - 1, by = 'day'))) %>%
    rowwise() %>%
    mutate_at(vars(matches("_days")),
              funs(list(+!is.na(`length<-`(seq(.), ndays))))) %>%
    unnest
# A tibble: 8 x 5
#  id    date       apple_days banana_days mango_days
#  <fct> <date>          <int>       <int>      <int>
#1 a     2011-12-18          1           1          1
#2 a     2011-12-19          1           1          0
#3 a     2011-12-20          1           1          0
#4 a     2011-12-24          1           1          1
#5 a     2011-12-25          1           1          1
#6 a     2011-12-26          0           0          1
#7 a     2011-12-27          0           0          1
#8 a     2011-12-28          0           0          1