I want to do a left join in data.table: I want to join panelFull with panel on OutletID. From panel, I want the CellID column added to panelFull:
> panel[1:15,]
Period CellID OutletID ACV
1: 215 1268 M44600 9563317
2: 215 1268 M44800 8966339
3: 215 1268 M45100 7043924
4: 215 1268 M45200 9013918
5: 215 1268 M45300 10009468
6: 215 1268 M46900 22148703
7: 215 1268 M48400 18661734
8: 215 1268 M51000 8531347
9: 215 1268 M51500 9125734
10: 215 1268 M51600 8575727
11: 215 1268 M53700 12148614
12: 215 1268 M57000 9678589
13: 215 1268 M59400 17261166
14: 215 1268 M60200 7939758
15: 215 1268 M60700 6840897
> panelFull[1:15,]
OutletID pno
1: CP0001 204
2: CP0001 205
3: CP0001 206
4: CP0001 207
5: CP0001 208
6: CP0001 209
7: CP0001 210
8: CP0001 211
9: CP0001 212
10: CP0001 213
11: CP0001 214
12: CP0001 215
13: CP0006 204
14: CP0006 205
15: CP0006 206
I would like something like this:
OutletID pno CellID
How can I do this with data.table?
Answer 0 (score: 5)
The following should give you the desired result:
panelFull[panel, CellID:=CellID, on="OutletID"]
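With the datasets as provided, this produces a column containing only NA values, because no OutletID values match between the two datasets. I therefore slightly adjusted the contents of the panelFull dataset (you can find the dput at the end of this answer). The result of the join is then: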
> panelFull
OutletID pno CellID
1: CP0001 204 NA
2: CP0001 205 NA
3: CP0001 206 NA
4: CP0001 207 NA
5: CP0001 208 NA
6: CP0001 209 NA
7: CP0001 210 NA
8: CP0001 211 NA
9: CP0001 212 NA
10: CP0001 213 NA
11: CP0001 214 NA
12: CP0001 215 NA
13: CP0006 204 NA
14: CP0006 205 NA
15: CP0006 206 NA
16: M60700 215 1268
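To make the behaviour of the update join easier to see, here is a minimal sketch on two toy tables. The names left and right and all values are made up for illustration and are not the asker's data. It uses the i. prefix to state explicitly that CellID comes from the table inside the brackets.

```r
library(data.table)

# Toy tables (hypothetical values, not the asker's data)
left  <- data.table(OutletID = c("A1", "A1", "B2", "C3"),
                    pno      = c(204L, 205L, 204L, 215L))
right <- data.table(OutletID = c("C3", "D4"),
                    CellID   = c(1268L, 1300L))

# Update join: adds CellID to `left` by reference.
# The row count and row order of `left` stay the same;
# rows without a matching OutletID in `right` get NA.
left[right, CellID := i.CellID, on = "OutletID"]

print(left)
```

Because the column is assigned by reference, no reassignment of left is needed. If right held several rows for the same OutletID, only one of them would end up in the new column, so it may be worth deduplicating the lookup table first.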
Data used:
panelFull <- structure(list(OutletID = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L), .Label = c("CP0001", "CP0006", "M60700"), class = "factor"), pno = c(204L, 205L, 206L, 207L, 208L, 209L, 210L, 211L, 212L, 213L, 214L, 215L, 204L, 205L, 206L, 215L)), .Names = c("OutletID", "pno"), class = c("data.table", "data.frame"), row.names = c(NA, -16L))
panel <- structure(list(Period = c(215L, 215L, 215L, 215L, 215L, 215L, 215L, 215L, 215L, 215L, 215L, 215L, 215L, 215L, 215L), CellID = c(1268L, 1268L, 1268L, 1268L, 1268L, 1268L, 1268L, 1268L, 1268L, 1268L, 1268L, 1268L, 1268L, 1268L, 1268L), OutletID = structure(1:15, .Label = c("M44600", "M44800", "M45100", "M45200", "M45300", "M46900", "M48400", "M51000", "M51500", "M51600", "M53700", "M57000", "M59400", "M60200", "M60700"), class = "factor"), ACV = c(9563317L, 8966339L, 7043924L, 9013918L, 10009468L, 22148703L, 18661734L, 8531347L, 9125734L, 8575727L, 12148614L, 9678589L, 17261166L, 7939758L, 6840897L)), .Names = c("Period", "CellID", "OutletID", "ACV"), class = c("data.table", "data.frame"), row.names = c(NA, -15L))
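If a new table is preferred over modifying panelFull in place, a left join can also be written in the two ways sketched below. This is only an illustration on toy tables; the names left and right and their values are assumptions, not the data above.

```r
library(data.table)

# Toy tables (hypothetical values)
left  <- data.table(OutletID = c("A1", "B2", "C3"),
                    pno      = c(204L, 204L, 215L))
right <- data.table(OutletID = c("C3", "D4"),
                    CellID   = c(1268L, 1300L),
                    ACV      = c(6840897L, 1L))

# X[Y, on = ] keeps every row of Y (here `left`), in the order of `left`;
# the second bracket selects the wanted columns
res1 <- right[left, on = "OutletID"][, .(OutletID, pno, CellID)]

# merge() with all.x = TRUE keeps every row of its first argument;
# the result is sorted by the `by` column
res2 <- merge(left, right[, .(OutletID, CellID)], by = "OutletID", all.x = TRUE)

print(res1)
print(res2)
```

Both variants leave left untouched and return a copy, which can matter for memory use on large tables.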
Answer 1 (score: 0)
I tried this and it works:
setkey(panelFull,CELLDEF)
setkey(cells,CellID)
panelFull = cells[panelFull]
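The snippet above uses the names cells and CELLDEF, which do not appear in the question. A sketch of the same keyed-join pattern with the question's table and column names might look as follows; the values are toy stand-ins, not the asker's data.

```r
library(data.table)

# Toy stand-ins for the question's tables (values are made up)
panelFull <- data.table(OutletID = c("M60700", "CP0001", "CP0006"),
                        pno      = c(215L, 204L, 204L))
panel     <- data.table(OutletID = c("M44600", "M60700"),
                        CellID   = c(1268L, 1268L))

# Setting a key sorts each table by OutletID, by reference
setkey(panelFull, OutletID)
setkey(panel, OutletID)

# X[Y] on keyed tables: every row of panelFull is kept,
# and CellID is NA where panel has no matching OutletID
joined <- panel[panelFull]

print(joined)
```

Note that setkey reorders the rows of both tables. Passing on = "OutletID" instead of setting keys gives the same join without that reordering.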