Calculating the day-over-day advance of a variable's value

Time: 2015-07-09 14:13:49

Tags: r

Here is my data:

ID       Day       number.of.day
ID1      Day1          5
ID1      Day1          5
ID1      Day1          5
ID1      Day1          5
ID1      Day1          5
ID1      Day2          4
ID1      Day2          4
ID1      Day2          4
ID1      Day2          4
ID1      Day3          1
ID1      Day4          1
ID2      Day1          2
ID2      Day1          2
ID2      Day2          3
ID2      Day2          3
ID2      Day2          3
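
To reproduce the examples, the table above could be rebuilt as a data frame along these lines. This is only a sketch: storing `ID` and `Day` as factors is an assumption (the answers that call `as.numeric(Day)` rely on it), and the name `df` simply matches what the answers use.

```r
# Rebuild the sample data; the run lengths mirror the rows shown above.
df <- data.frame(
  ID  = rep(c("ID1", "ID2"), times = c(11, 5)),
  Day = c(rep(c("Day1", "Day2", "Day3", "Day4"), times = c(5, 4, 1, 1)),
          rep(c("Day1", "Day2"), times = c(2, 3))),
  number.of.day = c(rep(c(5, 4, 1, 1), times = c(5, 4, 1, 1)),
                    rep(c(2, 3), times = c(2, 3))),
  stringsAsFactors = TRUE  # assumption: ID and Day are stored as factors
)
str(df)
```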

Update

I want to compute, day by day, the advance of number.of.day for each ID. This is the expected result:

ID       Day       number.of.day      advance
ID1      Day1          5                NA
ID1      Day1          5                NA
ID1      Day1          5                NA
ID1      Day1          5                NA
ID1      Day1          5                NA
ID1      Day2          4                (4-5)/5
ID1      Day2          4                NA
ID1      Day2          4                NA
ID1      Day2          4                NA
ID1      Day3          1                (1-4)/4
ID1      Day4          1                (1-1)/1
ID2      Day1          2                NA
ID2      Day1          2                NA
ID2      Day2          3                (3-2)/2
ID2      Day2          3                NA
ID2      Day2          3                NA
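
The rule behind the expected column is (current value - previous day's value) / previous day's value, reported on the first row of each new day. As a small worked check for ID1, assuming one value per day (the vector `x` exists only for illustration):

```r
# One number.of.day value per day for ID1 (Day1 to Day4)
x <- c(5, 4, 1, 1)
# diff(x) is current minus previous; head(x, -1) drops the last value,
# leaving the "previous" values to divide by. Day1 has no predecessor.
advance <- c(NA, diff(x) / head(x, -1))
advance
# values: NA, -0.20, -0.75, 0.00
```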

Hoping for your reply!

3 answers:

Answer 0 (score: 2)

Here is a simple and efficient solution using data.table:

library(data.table)
# by = ID restarts the calculation for each ID, so an ID's first day gets NA
setDT(df)[!duplicated(df), advance := c(NA, diff(number.of.day)/number.of.day[-.N]), by = ID]
#      ID  Day number.of.day advance
#  1: ID1 Day1             5      NA
#  2: ID1 Day1             5      NA
#  3: ID1 Day1             5      NA
#  4: ID1 Day1             5      NA
#  5: ID1 Day1             5      NA
#  6: ID1 Day2             4   -0.20
#  7: ID1 Day2             4      NA
#  8: ID1 Day2             4      NA
#  9: ID1 Day2             4      NA
# 10: ID1 Day3             1   -0.75
# 11: ID1 Day4             1    0.00
# 12: ID2 Day1             2      NA
# 13: ID2 Day1             2      NA
# 14: ID2 Day2             3    0.50
# 15: ID2 Day2             3      NA
# 16: ID2 Day2             3      NA

Answer 1 (score: 2)

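library(dplyr)
# Assumes Day is a factor, so as.numeric(Day) returns its integer codes
newdf <- df %>% group_by(ID) %>%
  mutate(advance = c(NA, head((lead(number.of.day) - number.of.day) / number.of.day, -1)),
         # diff is 0 on the first row of each ID and wherever Day repeats
         diff = c(0, diff(as.numeric(Day))))
# Blank out advance on every row that does not start a new day
is.na(newdf$advance) <- newdf$diff == 0L
newdf[, -5]  # drop the helper column diff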
# Source: local data frame [16 x 4]
# Groups: ID
# 
#     ID  Day number.of.day advance
# 1  ID1 Day1             5      NA
# 2  ID1 Day1             5      NA
# 3  ID1 Day1             5      NA
# 4  ID1 Day1             5      NA
# 5  ID1 Day1             5      NA
# 6  ID1 Day2             4   -0.20
# 7  ID1 Day2             4      NA
# 8  ID1 Day2             4      NA
# 9  ID1 Day2             4      NA
# 10 ID1 Day3             1   -0.75
# 11 ID1 Day4             1    0.00
# 12 ID2 Day1             2      NA
# 13 ID2 Day1             2      NA
# 14 ID2 Day2             3    0.50
# 15 ID2 Day2             3      NA
# 16 ID2 Day2             3      NA

Answer 2 (score: 2)

Here is another very simple suggestion that uses only base R:

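# Assumes Day is a factor whose level order follows the days
new_day <- which(diff(as.numeric(df$Day)) > 0)  # last row before each new day
day_change <- c(diff(df$number.of.day), 0)      # next value minus current value
res <- day_change / df$number.of.day
temp <- res[new_day]
res[res == 0] <- NA                # drop the zeros produced by repeated rows...
res[new_day] <- temp               # ...but keep genuine zeros at day changes
res <- c(NA, res[-length(res)])    # shift down so the value sits on the new day
res[!duplicated(df$ID)] <- NA      # an ID's first row has no previous day
df <- cbind(df, res)
#> df
#    ID  Day number.of.day   res
#1  ID1 Day1             5    NA
#2  ID1 Day1             5    NA
#3  ID1 Day1             5    NA
#4  ID1 Day1             5    NA
#5  ID1 Day1             5    NA
#6  ID1 Day2             4 -0.20
#7  ID1 Day2             4    NA
#8  ID1 Day2             4    NA
#9  ID1 Day2             4    NA
#10 ID1 Day3             1 -0.75
#11 ID1 Day4             1  0.00
#12 ID2 Day1             2    NA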
#13 ID2 Day1             2    NA
#14 ID2 Day2             3  0.50
#15 ID2 Day2             3    NA
#16 ID2 Day2             3    NA
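
For comparison, here is a minimal base-R sketch that computes the change within each ID separately, so the first day of every ID comes out as NA, as in the expected result. It is not taken from any of the answers: the helper name `advance_by_id` and the sample frame `df_demo` are hypothetical, and the sketch assumes rows are already sorted by ID and day and that `number.of.day` is constant within each (ID, Day) pair.

```r
# Hypothetical helper: relative change of number.of.day between consecutive
# days within each ID, placed on the first row of each new day.
advance_by_id <- function(df) {
  key <- paste(df$ID, df$Day)
  first <- !duplicated(key)            # first row of each (ID, Day) pair
  adv <- rep(NA_real_, nrow(df))
  # Per ID, over first rows only: (current - previous) / previous
  adv[first] <- ave(df$number.of.day[first], df$ID[first],
                    FUN = function(x) c(NA, diff(x) / head(x, -1)))
  adv
}

# Self-contained sample mirroring the question's data
df_demo <- data.frame(
  ID = rep(c("ID1", "ID2"), times = c(11, 5)),
  Day = c(rep(paste0("Day", 1:4), times = c(5, 4, 1, 1)),
          rep(paste0("Day", 1:2), times = c(2, 3))),
  number.of.day = c(rep(c(5, 4, 1, 1), times = c(5, 4, 1, 1)),
                    rep(c(2, 3), times = c(2, 3)))
)
df_demo$advance <- advance_by_id(df_demo)
df_demo
```

Grouping with `ave()` keeps the result aligned with the original row order, which avoids a separate merge step; whether that is preferable to the data.table or dplyr versions depends on which packages are already in use.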