I have two data tables; their dput output is below:
dput(x)
structure(list(site = c("A", "B", "C"), date = c("2018-05-06 00:00:05",
"2018-05-06 12:00:00", "2018-05-06 17:00:00")), .Names = c("site",
"date"), row.names = c(NA, -3L), class = c("data.table", "data.frame"
), .internal.selfref = <pointer: 0x0000000002570788>)
dput(y)
structure(list(sites = c("A", "A", "B"), vol = c(30, 40, 20),
date = structure(c(1525611600, 1525625640, 1525564805), class = c("POSIXct",
"POSIXt"), tzone = ""), pn = c("sp90", "sp70", "sp98")), .Names = c("sites",
"vol", "date", "pn"), class = c("data.table", "data.frame"), row.names = c(NA,
-3L), .internal.selfref = <pointer: 0x0000000002570788>)
The resulting data table should be:
site date vol pn
1: A 2018-05-06 00:00:05 30 sp90
2: A 2018-05-06 12:00:00 40 sp70
3: B 2018-05-06 17:00:00 20 sp98
I need to first check whether the site matches, then check whether x$date is less than y$date, and then pull vol and pn into x.
Any ideas?
Thanks.
Answer 0 (score: 0)
You could do something like this:
library(data.table)
# Convert the date columns to POSIXct (y$date already is, so the second call leaves it unchanged)
setDT(x)[, date := as.POSIXct(date)]
setDT(y)[, date := as.POSIXct(date)]
x[, c("vol", "pn", "site") :=      # assign the join result to these columns
    x[y,                           # join
      .(vol, pn, site),            # get the columns you need
      on = .(site = sites,         # join conditions
             date < date),
      mult = "last"]]
Output:
> x
site date vol pn
1: A 2018-05-06 00:00:05 30 sp90
2: A 2018-05-06 12:00:00 40 sp70
3: B 2018-05-06 17:00:00 20 sp98
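As a possible alternative, here is a minimal sketch of an update join that adds vol and pn to x by reference and leaves the site column of x untouched. It rebuilds the question's tables with tz = "UTC", which is an assumption made so the date comparison does not depend on the local time zone. Rows of x with no matching row in y keep NA, and when several rows of y match the same row of x only one of them is kept, so this will not necessarily reproduce the output above.

```r
library(data.table)

# Rebuild the question's tables; tz = "UTC" is an assumption for reproducibility
x <- data.table(
  site = c("A", "B", "C"),
  date = as.POSIXct(c("2018-05-06 00:00:05",
                      "2018-05-06 12:00:00",
                      "2018-05-06 17:00:00"), tz = "UTC")
)
y <- data.table(
  sites = c("A", "A", "B"),
  vol   = c(30, 40, 20),
  date  = as.POSIXct(c(1525611600, 1525625640, 1525564805),
                     origin = "1970-01-01", tz = "UTC"),
  pn    = c("sp90", "sp70", "sp98")
)

# Non-equi update join: for each row of y, find the rows of x with
# x$site == y$sites and x$date < y$date, then copy vol and pn from y (i.) into x
x[y, on = .(site = sites, date < date), `:=`(vol = i.vol, pn = i.pn)]

print(x)
```

Because the assignment happens inside the join, x keeps its three original rows and its own site values; only the new vol and pn columns are filled in.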
Edit:
The datasets you provided in the question:
x = structure(list(site = c("A", "B", "C"),
date = c("2018-05-06 00:00:05", "2018-05-06 12:00:00", "2018-05-06 17:00:00")),
.Names = c("site","date"), row.names = c(NA, -3L), class = c("data.table", "data.frame"))
y= structure(list(sites = c("A", "A", "B"),
vol = c(30, 40, 20),
date = structure(c(1525611600, 1525625640, 1525564805),
class = c("POSIXct", "POSIXt"), tzone = ""),
pn = c("sp90", "sp70", "sp98")),
.Names = c("sites", "vol", "date", "pn"),
class = c("data.table", "data.frame"),
row.names = c(NA,-3L))