Rprof shows that the following operation, which I run for each period, is quite slow:
stockHistory[.(p), stock:=stockHistory[.(p), stock] - (backorderedDemands[.(p-1),backlog] - backorderedDemands[.(p),backlog])]
I suspect this is because of the subtraction
backorderedDemands[.(p-1),backlog] - backorderedDemands[.(p),backlog]
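Is there a way to speed this operation up?
.(p) subsets the data.table to period p, and .(p-1) subsets the previous period (see the sample data below). Would applying some kind of diff() here be faster? I don't know how to do that, though.
Sample data: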
backorderedDemands<-CJ(period=1:1000, articleID=letters[1:10], backlog=0)[,backlog:=round(runif(10000)*42,0)]
setkey(backorderedDemands,period, articleID)
stockHistory<-CJ(period=1:1000, articleID=letters[1:10], stock=0)[,stock:=round(runif(10000)*42+66,0)]
setkey(stockHistory,period, articleID)
Answer 0 (score: 3)
You can first compute a difference column in backorderedDemands.
backorderedDemands[, diff := c(NA, -diff(backlog)), by=articleID]
Also, there is no need to use stockHistory[.(p), stock] on the right-hand side. Inside the subset you can simply refer to stock.
stockHistory[.(p), stock:=stock - backorderedDemands[.(p), diff]]
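To check that the precomputed column really gives the same result as the original expression, here is a minimal sketch. It assumes the update runs in a loop over p = 2, ..., 1000 (the question does not show the loop) and that the backlog values are already known before the loop starts. The names slowVersion and fastVersion are made up for this comparison.

```r
library(data.table)

set.seed(1)
# Sample data as in the question; CJ() already keys by period, articleID.
backorderedDemands <- CJ(period = 1:1000, articleID = letters[1:10])
backorderedDemands[, backlog := round(runif(.N) * 42)]
stockHistory <- CJ(period = 1:1000, articleID = letters[1:10])
stockHistory[, stock := round(runif(.N) * 42 + 66)]

# Previous backlog minus current backlog, per article (NA in the first period).
backorderedDemands[, diff := c(NA, -diff(backlog)), by = articleID]

# Reference: the original per-period expression from the question.
slowVersion <- copy(stockHistory)
for (p in 2:1000) {
  slowVersion[.(p), stock := stock - (backorderedDemands[.(p - 1), backlog] -
                                        backorderedDemands[.(p), backlog])]
}

# Same loop, but reading the precomputed diff column instead.
fastVersion <- copy(stockHistory)
for (p in 2:1000) {
  fastVersion[.(p), stock := stock - backorderedDemands[.(p), diff]]
}

# Compare both results.
all.equal(slowVersion$stock, fastVersion$stock)
```

Wrapping each loop in system.time() is an easy way to see how much the precomputed column saves on your machine. If the backlog is only produced inside the loop, the diff column cannot be computed up front and this approach does not apply as written.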
Answer 1 (score: 1)
If you want to compute the first difference of your data, you can do it as shown below. It is fast... I compute it step by step.
library(data.table)
library(dplyr)
set.seed(1)
# Sample data (same construction as in the question)
backorderedDemands <-
  CJ(period = 1:1000,
     articleID = letters[1:10],
     backlog = 0)[, backlog := round(runif(10000) * 42, 0)]
stockHistory <-
  CJ(period = 1:1000,
     articleID = letters[1:10],
     stock = 0)[, stock := round(runif(10000) * 42 + 66, 0)]
# Join both tables, then compute the lagged backlog and the differences per article
merge(stockHistory, backorderedDemands,
      by = c("period", "articleID")) %>%
  group_by(articleID) %>%
  mutate(lag_backlog = lag(backlog, 1),              # backlog of the previous period
         my_backlog_diff = backlog - lag_backlog,    # first difference of the backlog
         my_diff = stock + my_backlog_diff) %>%      # adjusted stock
  as.data.frame(.) %>%
  head(., 20)
   period articleID stock backlog lag_backlog my_backlog_diff my_diff
1       1         a    69      11          NA              NA      NA
2       1         b    94      16          NA              NA      NA
3       1         c    97      24          NA              NA      NA
4       1         d    71      38          NA              NA      NA
5       1         e    68       8          NA              NA      NA
6       1         f    71      38          NA              NA      NA
7       1         g   103      40          NA              NA      NA
8       1         h   101      28          NA              NA      NA
9       1         i   102      26          NA              NA      NA
10      1         j    67       3          NA              NA      NA
11      2         a    71       9          11              -2      69
12      2         b    89       7          16              -9      80
13      2         c    71      29          24               5      76
14      2         d    96      16          38             -22      74
15      2         e    96      32           8              24     120
16      2         f    99      21          38             -17      82
17      2         g    92      30          40             -10      82
18      2         h    87      42          28              14     101
19      2         i    85      16          26             -10      75
20      2         j    67      33           3              30      97
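If you would rather stay within data.table and skip the dplyr dependency, the same columns can be built with a keyed join and shift(). This is only a sketch of an equivalent, not part of the answer above, and the name combined is made up here. It relies on both tables being keyed by period, articleID, which CJ() does by default.

```r
library(data.table)

set.seed(1)
# Sample data; CJ() returns a table keyed by period, articleID.
backorderedDemands <- CJ(period = 1:1000, articleID = letters[1:10])
backorderedDemands[, backlog := round(runif(.N) * 42)]
stockHistory <- CJ(period = 1:1000, articleID = letters[1:10])
stockHistory[, stock := round(runif(.N) * 42 + 66)]

# Keyed join on (period, articleID): adds backlog next to stock.
combined <- stockHistory[backorderedDemands]

# shift() lags by one row within each article; rows are ordered by period.
combined[, lag_backlog := shift(backlog), by = articleID]
combined[, my_backlog_diff := backlog - lag_backlog]
combined[, my_diff := stock + my_backlog_diff]

head(combined, 20)
```

Since stock + (backlog - lag_backlog) is the same as stock - (lag_backlog - backlog), my_diff matches the per-period update from the question for every period after the first. The first period stays NA because it has no predecessor.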