I have a main data.table with 364 rows and 3 columns:
Date Weekday Weight
2012-01-01 Monday 100
2013-01-02 Tuesday 200
...
and a helper data.table with 7 rows and 2 columns:
Weekday Coefficient
Monday 0.91
Tuesday 0.84
Wednesday 0.99
...
Now I want to create a 4th column in the main data.table holding "Weight / Coefficient", where the coefficient is chosen by weekday.
Weight_divided <- main[, Weight * help[Weekday==main$Weekday]$Coefficient]
The result looks like this:
Date Weekday Weight Weight_divided
2012-01-01 Monday 100 91
2013-01-02 Tuesday 200 168
2012-01-03 Wednesday 300 297
2012-01-04 Thursday 400 256
2012-01-05 Friday 500 399
2012-01-06 Saturday 600 410
2012-01-07 Sunday 700 680
2012-01-08 Monday 300 NA <--
2012-01-09 Tuesday 600 NA <--
...
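I think the problem is that the two data.tables have different lengths. Is there a way to reference the shorter data.table from inside an operation on the main data.table?
Answer 0 (score: 2)
Using data.table: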
library(data.table)
setkey(main, Weekday)[help, Weight_Coef := Weight*Coefficient][order(Date)]
# Weekday Date Weight Weight_Coef
# 1: Monday 2012-01-01 59 53.69
# 2: Tuesday 2012-01-02 45 37.80
# 3: Wednesday 2012-01-03 141 139.59
# 4: Thursday 2012-01-04 104 97.76
# 5: Friday 2012-01-05 133 109.06
#---
#360: Wednesday 2012-12-25 192 190.08
#361: Thursday 2012-12-26 79 74.26
#362: Friday 2012-12-27 39 31.98
#363: Saturday 2012-12-28 175 148.75
#364: Sunday 2012-12-29 134 116.58
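The same lookup can also be written as an update join with the `on=` argument, which avoids setting a key and so leaves the row order of the main table unchanged. The following is a minimal sketch, assuming a data.table version that supports `on=` (1.9.6 or later); the names `main2` and `help2` and the nine sample rows are illustrative stand-ins, not the data from the question.

```r
library(data.table)

# Illustrative stand-in tables (hypothetical names and values).
main2 <- data.table(
  Date    = as.Date("2012-01-01") + 0:8,
  Weekday = rep(c("Monday", "Tuesday", "Wednesday", "Thursday",
                  "Friday", "Saturday", "Sunday"), length.out = 9),
  Weight  = c(100, 200, 300, 400, 500, 600, 700, 300, 600)
)
help2 <- data.table(
  Weekday     = c("Monday", "Tuesday", "Wednesday", "Thursday",
                  "Friday", "Saturday", "Sunday"),
  Coefficient = c(0.91, 0.84, 0.99, 0.94, 0.82, 0.85, 0.87)
)

# Update join: match rows of main2 to help2 on Weekday and add the product
# to main2 by reference. The i. prefix refers to the column from help2.
main2[help2, on = "Weekday", Weight_Coef := Weight * i.Coefficient]

print(main2)
```

Because the join matches every row of the main table against its weekday, the second Monday and Tuesday (rows 8 and 9 here) get a coefficient as well, instead of `NA`.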
Data used for this answer:
set.seed(24)
main <- data.table(Weekday = rep(c('Monday', 'Tuesday', 'Wednesday',
                                   'Thursday', 'Friday', 'Saturday', 'Sunday'),
                                 length.out = 364),
                   Date = seq(as.Date('2012-01-01'), length.out = 364, by = 'day'),
                   Weight = sample(200, 364, replace = TRUE))
help <- data.table(Weekday = c('Monday', 'Tuesday', 'Wednesday',
                               'Thursday', 'Friday', 'Saturday', 'Sunday'),
                   Coefficient = c(0.91, 0.84, 0.99, 0.94, 0.82, 0.85, 0.87))
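The underlying idea, looking up one coefficient per row of the longer table rather than comparing two columns of different lengths, can also be sketched in base R with `match()`. This is only an illustration under assumed data: `main_df` and `help_df` are hypothetical names, and the four sample rows are made up for the example.

```r
# Hypothetical sample data for illustration.
main_df <- data.frame(
  Weekday = c("Monday", "Tuesday", "Monday", "Sunday"),
  Weight  = c(100, 200, 300, 700),
  stringsAsFactors = FALSE
)
help_df <- data.frame(
  Weekday     = c("Monday", "Tuesday", "Wednesday", "Thursday",
                  "Friday", "Saturday", "Sunday"),
  Coefficient = c(0.91, 0.84, 0.99, 0.94, 0.82, 0.85, 0.87),
  stringsAsFactors = FALSE
)

# For each row of main_df, find the position of its weekday in help_df.
idx <- match(main_df$Weekday, help_df$Weekday)

# Index the coefficients by those positions, giving one coefficient per row.
main_df$Weight_Coef <- main_df$Weight * help_df$Coefficient[idx]

print(main_df)
```

`match()` returns a vector as long as the main table, so the multiplication is element by element and repeated weekdays are handled the same way as first occurrences.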