我的数据与此类似,只是dt1
有 2900万行,dt2
只有15行(不是1500万)。< / p>
dt1 <- data.table(ID=1:4,City=c("Charlotte","DC","Salem","Boston"))
dt2 <- data.table(Birds=c("Saker","Peregrine","Barbary","Prarie","Golden","Coopers","Canary","Finch"),BirdType=c("Falcon","Falcon","Falcon","Falcon","Eagle","Hawk","Breakfast","Breakfast"))
这样的输出:
> dt1
ID City
1: 1 Charlotte
2: 2 DC
3: 3 Salem
4: 4 Boston
> dt2
Birds BirdType
1: Saker Falcon
2: Peregrine Falcon
3: Barbary Falcon
4: Prarie Falcon
5: Golden Eagle
6: Coopers Hawk
7: Canary Breakfast
8: Finch Breakfast
我想合并两个data.tables,其中每行dt1与dt2的所有行组合,最终给出一个带有32行的data.table,输出如下:
> dtMerged
ID City Birds BirdType
1: 1 Charlotte Saker Falcon
2: 1 Charlotte Peregrine Falcon
3: 1 Charlotte Barbary Falcon
4: 1 Charlotte Prarie Falcon
5: 1 Charlotte Golden Eagle
6: 1 Charlotte Coopers Hawk
7: 1 Charlotte Canary Breakfast
8: 1 Charlotte Finch Breakfast
9: 2 DC Saker Falcon
10: 2 DC Peregrine Falcon
11: 2 DC Barbary Falcon
12: 2 DC Prarie Falcon
13: 2 DC Golden Eagle
14: 2 DC Coopers Hawk
15: 2 DC Canary Breakfast
16: 2 DC Finch Breakfast
17: 3 Salem Saker Falcon
18: 3 Salem Saker Falcon
etc...
如何最好地实现这一点的任何想法将不胜感激。
我在Windows 7 PC上使用data.table
版本1.10.4。感谢。
答案 0 :(得分:1)
正如@akrun评论的那样,交叉连接似乎是解决问题的方法之一。为了实现它,我在this Stack Overflow post中引用了@jangorecki CJ.dt
的一个非常简洁的函数来获得所需的解决方案:
CJ.dt = function(X,Y) {
stopifnot(is.data.table(X),is.data.table(Y))
k = NULL
X = X[, c(k=1, .SD)]
setkey(X, k)
Y = Y[, c(k=1, .SD)]
setkey(Y, NULL)
X[Y, allow.cartesian=TRUE][, k := NULL][]
}
new_df <- CJ.dt(dt1, dt2)
setorder(new_df, ID)
以下是重新订购后的完整输出:
> new_df
ID City Birds BirdType
1: 1 Charlotte Saker Falcon
2: 1 Charlotte Peregrine Falcon
3: 1 Charlotte Barbary Falcon
4: 1 Charlotte Prarie Falcon
5: 1 Charlotte Golden Eagle
6: 1 Charlotte Coopers Hawk
7: 1 Charlotte Canary Breakfast
8: 1 Charlotte Finch Breakfast
9: 2 DC Saker Falcon
10: 2 DC Peregrine Falcon
11: 2 DC Barbary Falcon
12: 2 DC Prarie Falcon
13: 2 DC Golden Eagle
14: 2 DC Coopers Hawk
15: 2 DC Canary Breakfast
16: 2 DC Finch Breakfast
17: 3 Salem Saker Falcon
18: 3 Salem Peregrine Falcon
19: 3 Salem Barbary Falcon
20: 3 Salem Prarie Falcon
21: 3 Salem Golden Eagle
22: 3 Salem Coopers Hawk
23: 3 Salem Canary Breakfast
24: 3 Salem Finch Breakfast
25: 4 Boston Saker Falcon
26: 4 Boston Peregrine Falcon
27: 4 Boston Barbary Falcon
28: 4 Boston Prarie Falcon
29: 4 Boston Golden Eagle
30: 4 Boston Coopers Hawk
31: 4 Boston Canary Breakfast
32: 4 Boston Finch Breakfast