I have data like this:
library(data.table)
DT = data.table(Brand = c('Apple', 'Apple'),
                Time1 = c('2015-11', '2016-01'),
                value1 = c(119.7268, 336.8033),
                value2 = c(3380, 7710))
I want to generate the following new data:
Brand Time1 Time2 LapseMonth value1 value2
Apple 2015-11-01 2015-11-01 0 119.7268 3380
Apple 2015-11-01 2015-12-01 1 286.2842 0
Apple 2015-11-01 2016-01-01 2 286.2842 0
Apple 2015-11-01 2016-02-01 3 267.8142 0
Apple 2015-11-01 2016-03-01 4 286.2842 0
Apple 2015-11-01 2016-04-01 5 277.0492 0
Apple 2015-11-01 2016-05-01 6 286.2842 0
Apple 2015-11-01 2016-06-01 7 277.0492 0
Apple 2015-11-01 2016-07-01 8 286.2842 0
Apple 2015-11-01 2016-08-01 9 286.2842 0
Apple 2015-11-01 2016-09-01 10 277.0492 0
Apple 2015-11-01 2016-10-01 11 286.2842 0
Apple 2015-11-01 2016-11-01 12 157.3224 0
Apple 2016-01-01 2016-01-01 0 336.8033 7710
Apple 2016-01-01 2016-02-01 1 610.9016 0
Apple 2016-01-01 2016-03-01 2 653.0328 0
Apple 2016-01-01 2016-04-01 3 631.9672 0
Apple 2016-01-01 2016-05-01 4 653.0328 0
Apple 2016-01-01 2016-06-01 5 631.9672 0
Apple 2016-01-01 2016-07-01 6 653.0328 0
Apple 2016-01-01 2016-08-01 7 653.0328 0
Apple 2016-01-01 2016-09-01 8 631.9672 0
Apple 2016-01-01 2016-10-01 9 653.0328 0
Apple 2016-01-01 2016-11-01 10 631.9672 0
Apple 2016-01-01 2016-12-01 11 653.0328 0
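Here is an explanation of the new data:
1. I generate two new columns (Time2 and LapseMonth).
2. I calculate value1 for each row: the first row of each group keeps its original value1, and the other rows use value2 * days_in_month(Time2) / 366.
3. Most importantly: if Time1 is in 2015 and LapseMonth is 12, then value1 = value2 * days_in_month(Time2) / 366 - the original value1.
See above: 157.3224 = 3380 * 30/366 - 119.7268.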
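The arithmetic behind these rules can be checked directly in base R. This is only a sketch using the first input row; the variable names `value1_orig`, `regular`, and `last` are introduced here for illustration, and the divisor 366 is taken from the formula above.

```r
# Values from the first input row (Time1 = 2015-11)
value1_orig <- 119.7268
value2 <- 3380

# Ordinary month, e.g. Time2 = 2015-12 (31 days)
regular <- value2 * 31 / 366

# Final month (LapseMonth == 12, Time2 = 2016-11, 30 days):
# subtract the original value1
last <- value2 * 30 / 366 - value1_orig

round(c(regular = regular, last = last), 4)
```

Rounded to four decimals, these match the 286.2842 and 157.3224 entries in the desired output.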
Here is my code:
library(lubridate)
DT[ , Time1 := as.Date(paste(Time1, '01', sep = "-"), "%Y-%m-%d")]
DT <- DT[ , rep := ifelse(year(Time1)==2016, 12-month(Time1)+1, 13)][rep(1:.N,rep)]
DT[ , LapseMonth := seq_len(.N)-1, by = .(Brand,Time1,value2)]
DT[ , Time2 := Time1 - days(mday(Time1)-1) + months(LapseMonth)]
DT[ , value1 := ifelse(Time1==Time2,value1,value2*days_in_month(Time2)/366)]
DT[ , value2 := ifelse(Time1==Time2,value2,0)]
I don't know how to use ifelse to set value1 when Time1 is in 2015 and LapseMonth is 12. Any ideas?
I tried:
DT[ , value1 := if (Time1==Time2 & LapseMonth==12) value2*days_in_month(Time2)/366-value1]
However, I got a warning:
Warning message:
In if (PurshasedDate == EXPMTH & LapseMonth == 12) WP * days_in_month(EXPMTH)/366 - :
the condition has length > 1 and only the first element will be used
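Answer 0: (score: 3)
I still don't understand the mapping from input to output, but anyway. Your main question is how to use ifelse to determine the number of days in the month. The answer is: don't.
Instead, just use a lookup table to get the number of days directly: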
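# Days per month; February is 29 because 2016 is a leap year
monthdays = data.table(month = sprintf('%02d', 1:12),
                       ndays = c(31, 29, 31, 30, 31, 30,
                                 31, 31, 30, 31, 30, 31),
                       key = 'month')
# Assumes Time1 has already been converted to Date
DT[ , {
  # Monthly sequence from Time1, ending at 2016-12 and capped at 13 months
  Time2 = seq.Date(Time1, as.Date('2016-12-01'), by = 'month')
  Time2 = Time2[seq_len(min(13L, length(Time2)))]
  LapseMonth = seq_along(Time2) - 1L
  # Look up the days in each month by its two-digit month string
  value1 = value2*monthdays[format(Time2, '%m'), ndays]/366 - value1
  .(Brand = Brand, Time2 = Time2,
    LapseMonth = LapseMonth,
    value1 = value1,
    value2 = c(value2, rep(0, length(LapseMonth) - 1)))
}, by = Time1]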
# Time1 Brand Time2 LapseMonth value1 value2
# 1: 2015-11-01 Apple 2015-11-01 0 157.3224 3380
# 2: 2015-11-01 Apple 2015-12-01 1 166.5574 0
# 3: 2015-11-01 Apple 2016-01-01 2 166.5574 0
# 4: 2015-11-01 Apple 2016-02-01 3 148.0874 0
# 5: 2015-11-01 Apple 2016-03-01 4 166.5574 0
# 6: 2015-11-01 Apple 2016-04-01 5 157.3224 0
# 7: 2015-11-01 Apple 2016-05-01 6 166.5574 0
# 8: 2015-11-01 Apple 2016-06-01 7 157.3224 0
# 9: 2015-11-01 Apple 2016-07-01 8 166.5574 0
# 10: 2015-11-01 Apple 2016-08-01 9 166.5574 0
# 11: 2015-11-01 Apple 2016-09-01 10 157.3224 0
# 12: 2015-11-01 Apple 2016-10-01 11 166.5574 0
# 13: 2015-11-01 Apple 2016-11-01 12 157.3224 0
# 14: 2016-01-01 Apple 2016-01-01 0 316.2295 7710
# 15: 2016-01-01 Apple 2016-02-01 1 274.0983 0
# 16: 2016-01-01 Apple 2016-03-01 2 316.2295 0
# 17: 2016-01-01 Apple 2016-04-01 3 295.1639 0
# 18: 2016-01-01 Apple 2016-05-01 4 316.2295 0
# 19: 2016-01-01 Apple 2016-06-01 5 295.1639 0
# 20: 2016-01-01 Apple 2016-07-01 6 316.2295 0
# 21: 2016-01-01 Apple 2016-08-01 7 316.2295 0
# 22: 2016-01-01 Apple 2016-09-01 8 295.1639 0
# 23: 2016-01-01 Apple 2016-10-01 9 316.2295 0
# 24: 2016-01-01 Apple 2016-11-01 10 295.1639 0
# 25: 2016-01-01 Apple 2016-12-01 11 316.2295 0
# Time1 Brand Time2 LapseMonth value1 value2
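The same lookup idea can be sketched without a keyed data.table, using a plain named vector in base R. This is an illustrative variant, not the answer's own code: `ndays`, `days`, and the sample `Time2` dates are assumptions made here, and hard-coding February as 29 is only valid for leap years such as 2016.

```r
# Days per month for a leap year (February = 29), as in the lookup table above
ndays <- c(31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
names(ndays) <- sprintf('%02d', 1:12)

# Sample first-of-month dates (hypothetical input)
Time2 <- as.Date(c('2016-02-01', '2016-04-01', '2016-12-01'))

# Index by the two-digit month string; the lookup is vectorized
days <- unname(ndays[format(Time2, '%m')])
days

# Same scaling as in the answer: value2 * ndays / 366
3380 * days / 366
```

Indexing a named vector by a character vector returns one element per input, so no ifelse is needed to pick the day count.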
Answer 1: (score: -1)
I figured it out myself, so I'm posting the answer here. Maybe other users will have a similar problem in the future.
library(data.table)
library(lubridate)
DT <- data.table(Brand = c('Apple', 'Apple'),
                 Time1 = c('2015-11', '2016-01'),
                 value1 = c(119.7268, 336.8033),
                 value2 = c(3380, 7710))
# Parse 'YYYY-MM' as the first day of that month
DT[, Time1 := ymd(paste(Time1, '01', sep = '-'))]
# Rows per group: 13 for a 2015 start, the remaining months of 2016 otherwise
DT <- DT[, rep := ifelse(year(Time1) == 2016, 12 - month(Time1) + 1, 13)][rep(1:.N, rep)]
DT[, LapseMonth := seq_len(.N) - 1, by = .(Brand, Time1, value2)]
DT[, Time2 := Time1 - days(mday(Time1) - 1) + months(LapseMonth)]
# First row keeps value1; the LapseMonth == 12 row subtracts the original value1
DT[, value1 := ifelse(Time1 == Time2, value1,
               ifelse(LapseMonth == 12,
                      value2 * days_in_month(Time2) / 366 - value1,
                      value2 * days_in_month(Time2) / 366))]
DT[, value2 := ifelse(Time1 == Time2, value2, 0)]
With that, I got the result I wanted.
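The warning in the question comes from `if`, which expects a single TRUE or FALSE, whereas `ifelse` evaluates its condition element by element (in R 4.2.0 and later, a condition of length greater than one in `if` is an error rather than a warning). A minimal base R sketch of the difference, using made-up vectors (`lapse`, `v1`, `v2`, and `adjusted` are illustrative names, not columns from the original data):

```r
lapse <- c(0, 1, 12)
v1 <- c(10, 10, 10)
v2 <- c(100, 100, 100)

# ifelse() tests every element, so only the entry with lapse == 12
# gets the subtraction; the others keep v2 unchanged
adjusted <- ifelse(lapse == 12, v2 - v1, v2)
adjusted
```

Only the third element changes, which is the per-row behavior the nested ifelse in the code above relies on.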