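My data frame has 3 columns (ID, date, days). Column X is what I want. Where days is NA, I want to add the number of days in the current month to the previous month's value. Can this be done with dplyr? I tried a for loop, but with more than 5M rows it takes too long.

ID date days X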
A 2014-01-31 NA NA
A 2014-02-28 NA NA
A 2014-03-31 4 4
A 2014-04-30 NA 34
A 2014-05-31 NA 65
A 2014-06-30 NA 95
A 2014-07-31 NA 126
B 2014-01-31 NA NA
B 2014-02-28 11 11
B 2014-03-31 6 6
B 2014-04-30 NA 36
B 2014-05-31 6 6
B 2014-06-30 NA 36
C 2015-01-31 NA NA
C 2015-02-28 NA NA
Answer 0 (score: 2)
Here is an approach using tidyverse:
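library(tidyverse)

df %>%
  mutate(date = as.Date(date, format = '%Y-%m-%d')) %>%
  group_by(ID) %>%
  # start a new run at every non-NA value of days
  mutate(new = cumsum(!is.na(days)) + 1) %>%
  group_by(ID, new) %>%
  # within each run, accumulate days, using the date differences where days is NA
  mutate(new1 = cumsum(ifelse(is.na(days), as.numeric(diff.difftime(date)), days)),
         # rows before the first non-NA value of an ID stay NA
         new1 = replace(new1, new == 1, NA)) %>%
  ungroup() %>%
  select(-new)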
# A tibble: 15 x 5
# ID date days X new1
# <fctr> <date> <int> <int> <dbl>
# 1 A 2014-01-31 NA NA NA
# 2 A 2014-02-28 NA NA NA
# 3 A 2014-03-31 4 4 4
# 4 A 2014-04-30 NA 34 35
# 5 A 2014-05-31 NA 65 65
# 6 A 2014-06-30 NA 95 96
# 7 A 2014-07-31 NA 126 126
# 8 B 2014-01-31 NA NA NA
# 9 B 2014-02-28 11 11 11
#10 B 2014-03-31 6 6 6
#11 B 2014-04-30 NA 36 36
#12 B 2014-05-31 6 6 6
#13 B 2014-06-30 NA 36 36
#14 C 2015-01-31 NA NA NA
#15 C 2015-02-28 NA NA NA
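
Note that in the output above new1 differs from X in rows 4 and 6 (35 vs 34, 96 vs 95). This appears to happen because the date differences contain one fewer value than the group has rows, so ifelse() recycles them out of alignment. The following is a minimal sketch of one way to keep them aligned, not a definitive fix: it pads the differences with a leading 0 so that each row carries the gap to the previous row. The column names gap and run are illustrative and not part of the original answer, and the sketch assumes rows are sorted by date within each ID and that dates are consecutive month ends (so the gap equals the number of days in the current month).

```r
library(dplyr)

# Sample data copied from the question (X is the desired result).
df <- data.frame(
  ID   = rep(c("A", "B", "C"), times = c(7, 6, 2)),
  date = as.Date(c(
    "2014-01-31", "2014-02-28", "2014-03-31", "2014-04-30",
    "2014-05-31", "2014-06-30", "2014-07-31",
    "2014-01-31", "2014-02-28", "2014-03-31", "2014-04-30",
    "2014-05-31", "2014-06-30",
    "2015-01-31", "2015-02-28"
  )),
  days = c(NA, NA, 4, NA, NA, NA, NA, NA, 11, 6, NA, 6, NA, NA, NA),
  X    = c(NA, NA, 4, 34, 65, 95, 126, NA, 11, 6, 36, 6, 36, NA, NA)
)

res <- df %>%
  group_by(ID) %>%
  mutate(
    # Days since the previous row of the same ID (0 for the first row),
    # so gap has exactly one value per row. Hypothetical helper column.
    gap = c(0, as.numeric(diff(date))),
    # Run counter: increases at every non-NA value of days; 0 before the first.
    run = cumsum(!is.na(days))
  ) %>%
  group_by(ID, run) %>%
  mutate(
    # Within a run, start from the observed days and add the gaps after it.
    new1 = cumsum(ifelse(is.na(days), gap, days)),
    # Rows before the first observed value of an ID stay NA.
    new1 = replace(new1, run == 0, NA)
  ) %>%
  ungroup() %>%
  select(-gap, -run)

# On the sample data, new1 matches the desired X column.
all.equal(res$new1, df$X)
```

Because everything is expressed as grouped vector operations (diff, cumsum, ifelse) rather than a row-by-row loop, the same pipeline should scale to a large data frame far better than a for loop, though timing on 5M rows has not been measured here.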