Hello, I have a dataset in data.table format. Let's call it dtA:
Date Company Data
200012 compA 3
200012 compB 4
200012 compC 7
200101 compA 1
200101 compB 2
200101 compC 3
200102 compA 2
200102 compB 4
200102 compC 1
For each company, I want to subtract the previous date's value from each date's value.
So the final result would be:
Date Company Data
200102 compA 1
200102 compB 2
200102 compC -2
200101 compA -2
200101 compB -2
200101 compC -4
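I wrote long, clunky code to do this. Is there any way I could use lapply with data.table instead? I don't quite understand how to use lapply in data.table: since lapply has no iterator, I can't loop over the dates...
Here is my code: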
library(data.table)
dates=sort(unique(dtA$Date), decreasing=TRUE) # so it goes in descending order
result=NULL
for (i in seq_len(length(dates)-1)){
  d1=dates[i]   # later date
  d2=dates[i+1] # previous date
  dtA1=dtA[Date==d1]
  dtA2=dtA[Date==d2]
  # the two slices have different dates, so merge on Company only
  ans.temp=merge(dtA1, dtA2, by="Company")
  ans.temp[, Data := Data.x - Data.y]
  ans.temp=ans.temp[, .(Date=Date.x, Company, Data)]
  if (is.null(result)){
    result=ans.temp
  } else{
    result=rbind(result, ans.temp)
  }
}
Answer 0 (score: 3)
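You don't need a loop or apply; just use diff grouped by Company. Setting the key first sorts the rows by Date, so each company's values are differenced in chronological order: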
setkey(dtA, Date, Company)
dtA[, list(diff = diff(Data), Date = Date[-1]), by = Company]
# Company diff Date
# 1: compA -2 200101
# 2: compA 1 200102
# 3: compB -2 200101
# 4: compB 2 200102
# 5: compC -4 200101
# 6: compC -2 200102
Using this data:
dtA = fread("Date Company Data
200012 compA 3
200012 compB 4
200012 compC 7
200101 compA 1
200101 compB 2
200101 compC 3
200102 compA 2
200102 compB 4
200102 compC 1")
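If the exact layout from the question is wanted (columns Date, Company, Data, newest date first), one possible variation is sketched below. It swaps diff for data.table's shift, which lags a column by one row within each group; the intermediate name res is only illustrative and does not come from the original post.

```r
library(data.table)

dtA <- fread("Date Company Data
200012 compA 3
200012 compB 4
200012 compC 7
200101 compA 1
200101 compB 2
200101 compC 3
200102 compA 2
200102 compB 4
200102 compC 1")

# Put each company's rows in chronological order
setorder(dtA, Company, Date)

# Current value minus the previous value of the same company;
# the first date of each company has no predecessor and becomes NA
res <- dtA[, .(Date, Data = Data - shift(Data)), by = Company]

# Drop the NA rows, then arrange as in the question:
# columns Date, Company, Data with the newest date first
res <- res[!is.na(Data)]
setcolorder(res, c("Date", "Company", "Data"))
setorder(res, -Date, Company)
print(res)
```

Like the diff approach, this relies on the rows being sorted by date within each company before the differences are taken.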