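Merging two tables without a key

Time: 2018-08-22 14:29:04

Tags: r merge

I need to merge the two tables below to create the desired result (also shown below). I know how to do this in SQL, but SQL isn't an option in this case. How can I merge the two tables?

LabStations table: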

   StationNumber
1:             1
2:             2
3:             3
4:             4
5:             5

Calendar table:

                 dates
1: 2010-01-01 06:00:00
2: 2010-01-01 07:00:00
3: 2010-01-01 08:00:00
4: 2010-01-01 09:00:00
5: 2010-01-01 10:00:00
6: 2010-01-01 11:00:00

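Desired result:

Note: in SQL I would get the same result with the following query, but in this case I can't use SQL for the calendar.

select s.StationNumber, c.dates
from LabStations s, Calendar c

Desired result table: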

    StationNumber               dates
 1:             1 2010-01-01 06:00:00
 2:             1 2010-01-01 07:00:00
 3:             1 2010-01-01 08:00:00
 4:             1 2010-01-01 09:00:00
 5:             1 2010-01-01 10:00:00
 6:             2 2010-01-01 06:00:00
 7:             2 2010-01-01 07:00:00
 8:             2 2010-01-01 08:00:00
 9:             2 2010-01-01 09:00:00
10:             2 2010-01-01 10:00:00
...
21:             5 2010-01-01 06:00:00
22:             5 2010-01-01 07:00:00
23:             5 2010-01-01 08:00:00
24:             5 2010-01-01 09:00:00
25:             5 2010-01-01 10:00:00

Code to reproduce the tables:

library(data.table)

# Lab stations come from a database, so they really aren't sequential.
LabStations <- data.table(StationNumber = 1:5)


#Dates are really 5 years, not five hours.
dates <- seq.POSIXt(ISOdatetime(2010,1,1,0,0,0,tz="America/Chicago"), ISOdatetime(2010,1,1,4,0,0,tz="America/Chicago"), by="hour")
Calendar <- data.frame(dates)
setDT(Calendar)
attributes(Calendar$dates)$tzone <- "UTC"

#MergedTable <- ????

1 answer:

Answer 0 (score: 0)

You can use the crossing function from tidyr:

library(tidyr)
mergedtable <- crossing(LabStations, Calendar)

    StationNumber               dates
 1:             1 2010-01-01 06:00:00
 2:             1 2010-01-01 07:00:00
 3:             1 2010-01-01 08:00:00
 4:             1 2010-01-01 09:00:00
 5:             1 2010-01-01 10:00:00
 6:             2 2010-01-01 06:00:00
 7:             2 2010-01-01 07:00:00
 8:             2 2010-01-01 08:00:00
 9:             2 2010-01-01 09:00:00
10:             2 2010-01-01 10:00:00
11:             3 2010-01-01 06:00:00
12:             3 2010-01-01 07:00:00
13:             3 2010-01-01 08:00:00
14:             3 2010-01-01 09:00:00
15:             3 2010-01-01 10:00:00
16:             4 2010-01-01 06:00:00
17:             4 2010-01-01 07:00:00
18:             4 2010-01-01 08:00:00
19:             4 2010-01-01 09:00:00
20:             4 2010-01-01 10:00:00
21:             5 2010-01-01 06:00:00
22:             5 2010-01-01 07:00:00
23:             5 2010-01-01 08:00:00
24:             5 2010-01-01 09:00:00
25:             5 2010-01-01 10:00:00
    StationNumber               dates
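
Since the question already converts both tables with setDT, a data.table-only alternative may also be worth considering. The following is a minimal sketch, not part of the original answer: it rebuilds small example tables (the values are assumed, chosen to mirror the question) and repeats the calendar dates once per station by grouping on StationNumber. It assumes the station numbers are unique.

```r
library(data.table)

# Example data mirroring the question (values assumed for illustration).
LabStations <- data.table(StationNumber = 1:5)
Calendar <- data.table(
  dates = seq(as.POSIXct("2010-01-01 06:00:00", tz = "UTC"),
              by = "hour", length.out = 5)
)

# For each station, emit the full vector of calendar dates.
# Grouping by StationNumber keeps each station's rows together.
MergedTable <- LabStations[, .(dates = Calendar$dates), by = StationNumber]
```

The result should have one row per station and date combination, with the rows for each station kept together as in the desired result table.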

Base R solution:

merged <- expand.grid(LabStations$StationNumber, Calendar$dates, stringsAsFactors = FALSE)
merged <- merged[order(merged$Var1, method = "radix"), ]
merged <- setNames(merged, c("StationNumber", "dates"))

head(merged)
   StationNumber               dates
1              1 2010-01-01 06:00:00
6              1 2010-01-01 07:00:00
11             1 2010-01-01 08:00:00
16             1 2010-01-01 09:00:00
21             1 2010-01-01 10:00:00
2              2 2010-01-01 06:00:00
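
Another base R route that may be closer to the SQL in the question is merge() with by = NULL, which returns the Cartesian product of two data frames. This is a sketch under assumptions: the example data below is made up to mirror the question, and it uses plain data frames, because this by = NULL behaviour is documented for merge on data frames and is not assumed here to carry over to data.table objects.

```r
# Base R only; the data mirrors the question's example (assumed values).
LabStations <- data.frame(StationNumber = 1:5)
Calendar <- data.frame(
  dates = seq(as.POSIXct("2010-01-01 06:00:00", tz = "UTC"),
              by = "hour", length.out = 5)
)

# by = NULL asks merge() for the Cartesian product instead of a keyed join.
# The first argument varies fastest, so Calendar goes first to keep each
# station's rows together.
merged <- merge(Calendar, LabStations, by = NULL)

# Put the columns in the order shown in the desired result.
merged <- merged[, c("StationNumber", "dates")]
rownames(merged) <- NULL
```

Passing Calendar first avoids the separate ordering step, at the cost of having to reorder the columns afterwards.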