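I am trying to add a new 12-month lagged version of the Plasma_mean variable to my panel data. The Plasma_mean data starts 12 months before the other observations, which is why the other variables are NA at the head of the dataset.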
ProdGrp timeperiod Plasma_mean Mark.Invest_mean Reps_mean repcost_mean Sales_sum Pcs_vol_sum
1: 1/1/2003 948881 NA NA NA NA NA
2: 2/1/2003 787974 NA NA NA NA NA
3: 3/1/2003 872733 NA NA NA NA NA
4: 4/1/2003 932405 NA NA NA NA NA
5: 5/1/2003 922127 NA NA NA NA NA
---
155: Product A 4/1/2010 1325862 36362.49 1.33 14436.66 168874.9 718
156: Product B 5/1/2010 1253672 53821.38 8.17 14336.67 1989798.9 4549
157: Product A 5/1/2010 1253672 37146.27 1.33 14436.66 152519.5 596
158: Product B 6/1/2010 1334744 69749.48 8.17 14336.67 1978877.4 4612
159: Product A 6/1/2010 1334744 38093.63 1.33 14436.66 164404.0 689
gProt_vol_sum pckg_price_mean g_Prot_price_mean TotalpharmaBiosales_mean dollarized_reps_mean dates
1: NA NA NA NA NA 2003-01-01
2: NA NA NA NA NA 2003-02-01
3: NA NA NA NA NA 2003-03-01
4: NA NA NA NA NA 2003-04-01
5: NA NA NA NA NA 2003-05-01
---
155: 2378.5 191.0250 76.88328 6023500 19200.76 2010-04-01
156: 40109.5 288.6149 49.80379 6135394 30.59 2010-05-01
157: 2204.0 187.4431 76.11616 6135394 19200.76 2010-05-01
158: 41776.0 298.1715 55.74162 8673498 117130.59 2010-06-01
159: 2305.5 190.6980 76.77850 8673498 19200.76 2010-06-01
plasma_lagged
1: NA
2: NA
3: NA
4: NA
5: NA
---
155: NA
156: NA
157: NA
158: NA
159: NA
Using the data.table package, I did:
lag <- function(Plasma_mean, n = 12L, along_with){
  index <- match(along_with - n, along_with, incomparable = NA)
  out <- Plasma_mean[index]
  attributes(out) <- attributes(Plasma_mean)
  out
}
Then I appended it to my dataset by product group:
DT[, plasma_lagged := lag(Plasma_mean, 12, along_with = dates), by = ProdGrp]
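I get the plasma_lagged variable in the last column of my dataset, but it does not seem to contain any data (see observation 155 and the ones after it).
Any hints on how to solve this would be great.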
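Answer 0 (score: 0)
You are lagging by 12 days, not 12 months: subtracting the integer 12 from a Date moves it back by 12 days, and none of the resulting dates exist in your data, so match() returns NA everywhere. Try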
library(lubridate)
DT[, plasma_lagged := lag(Plasma_mean, months(12), along_with = dates), by = ProdGrp]
(Please provide code for a reproducible example; otherwise I cannot be sure this works.)
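To illustrate the days-versus-months point without the original data, here is a minimal base R sketch. The toy vectors `dates` and `values` and the helpers `months_back()` and `lag_by_months()` are hypothetical names invented for this example, and the helper assumes every date falls on the first of a month, as in the question's `dates` column.

```r
# Hypothetical toy data: 24 monthly observations for a single product group
dates  <- seq(as.Date("2003-01-01"), by = "month", length.out = 24)
values <- seq_along(dates)

# Subtracting an integer from a Date shifts it by DAYS, so no shifted
# date coincides with a first-of-month date and every lookup fails
day_index <- match(dates - 12L, dates)

# Shift first-of-month dates back by n calendar months
# (assumption: the day of month is always 1)
months_back <- function(d, n) {
  y <- as.integer(format(d, "%Y"))
  m <- as.integer(format(d, "%m"))
  total <- y * 12L + (m - 1L) - as.integer(n)  # months counted from year 0
  as.Date(sprintf("%04d-%02d-01", total %/% 12L, total %% 12L + 1L))
}

# Same idea as the question's lag(), but matching on calendar months
lag_by_months <- function(x, n = 12L, along_with) {
  x[match(months_back(along_with, n), along_with)]
}

lagged <- lag_by_months(values, 12L, along_with = dates)
```

In this sketch `day_index` is entirely NA, which matches the symptom in the question, while `lagged` is NA only for the first 12 months and afterwards carries the value observed 12 months earlier. The same helper could presumably be called inside `DT[, ..., by = ProdGrp]` in place of the original `lag()`, though that has not been checked against the real data.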