我一直试图弄清楚这一点并没有成功。我已经尝试过使用data.table,merge和apply函数,但是我是初学者,可能错过了一些东西。基本上我想使用来自一个数据集的datetime(“dt”)信息和唯一事件标识符(“uid”)来创建一个列,告诉我条件是否得到满足以及发生了哪个“sid”。复杂的是,对于给定的“dt”和“uid”,可能会出现不止一个“sid”并且我想知道这一点。我的数据集如下所示......
> head(D1)
X dt tid rid sid uid nutm eutm
1 1 2013-09-06 08:55:00 A69-1601-27 VR2W-11 BUIS_09 1 1932024 389346.7
2 2 2013-09-06 08:55:00 A69-1601-27 VR2W-12 BUIS_08 1 1932024 389346.7
3 3 2013-09-06 08:56:00 A69-1601-27 VR2W-11 BUIS_09 1 1932024 389346.7
4 4 2013-09-06 08:57:00 A69-1601-27 VR2W-11 BUIS_09 1 1932024 389346.7
5 5 2013-09-06 08:57:00 A69-1601-27 VR2W-12 BUIS_08 1 1932024 389346.7
6 6 2013-09-06 08:57:00 A69-1601-27 VR2W-11 BUIS_09 1 1932024 389346.7
> head(D2)
X dt uid
1 1 09/06/2013 08:57:00 1
2 2 09/06/2013 08:58:00 1
3 3 09/06/2013 08:59:00 1
4 4 09/06/2013 09:00:00 1
5 5 09/06/2013 09:01:00 1
6 6 09/06/2013 09:02:00 1
我想运行一个函数,从D2的第一行开始查看D1的每一行的“uid”和“dt”,并创建一个列,说明是否满足这些条件以及相关的“sid”条件得到满足。所以我的输出看起来像这样...
head(stimes)
X dt uid freq sid tid rid nutm eutm
1 1 09/06/2013 08:57:00 1 T BUIS_09 A69-1601-27 VR2W-11 1932024 389346.7
1 2 09/06/2013 08:57:00 1 T BUIS_08 A69-1601-27 VR2W-12 1967024 389346.7
更新:dput(head(D1))如下所示。这有点困难,因为有大量独特的日期时间,但总的来说......
"2014-05-03 09:12:00", "2014-05-03 09:13:00", "2014-05-03 09:14:00",
"2014-05-03 09:15:00", "2014-05-03 09:16:00", "2014-05-03 09:17:00",
"2014-05-03 09:18:00", "2014-05-03 09:19:00", "2014-05-03 09:20:00",
"2014-05-03 09:21:00", "2014-05-03 09:22:00"), class = "factor"),
tid = structure(c(2L, 2L, 3L, 2L, 2L, 2L), .Label = c("A69-1303-54",
"A69-1601-27", "A69-1601-275", "A69-1601-31"), class = "factor"),
rid = structure(c(30L, 30L, 30L, 30L, 30L, 30L), .Label = c("VR2W-102913",
"VR2W-914", "VR2W-916", "VR2W-102917", "VR2W-210",
"VR2W-710", "VR2W-711", "VR2W-103821", "VR2W-088",
"VR2W-620", "VR2W-621", "VR2W-104622", "VR2W-938",
"VR2W-320", "VR2W-145", "VR2W-110204", "VR2W-205",
"VR2W-207", "VR2W-208", "VR2W-110210", "VR2W-634",
"VR2W-118", "VR2W-119", "VR2W-114120", "VR2W-121",
"VR2W-122", "VR2W-123", "VR2W-114124", "VR2W-125",
"VR2W-498", "VR2W-499", "VR2W-115500", "VR2W-501",
"VR2W-502", "VR2W-503", "VR2W-115504", "VR2W-925",
"VR2W-272", "VR2W-273", "VR2W-120967", "VR2W-585"
), class = "factor"), sid = structure(c(9L, 9L, 9L, 9L, 9L,
9L), .Label = c("BUIS_01", "BUIS_02", "BUIS_03", "BUIS_04",
"BUIS_05", "BUIS_06", "BUIS_07", "BUIS_08", "BUIS_09", "BUIS_10",
"BUIS_11", "BUIS_12", "BUIS_13", "BUIS_14", "BUIS_15", "BUIS_16",
"BUIS_17", "BUIS_18", "BUIS_19", "BUIS_20", "BUIS_22", "BUIS_26",
"BUIS_27", "BUIS_31", "BUIS_33", "BUIS_34", "BUIS_35", "BUIS_36",
"BUIS_37", "BUIS_38", "BUIS_41", "BUIS_42", "BUIS_43", "BUIS_46"
), class = "factor"), uid = c(NA, NA, NA, 1L, 1L, 1L), nutm = c(NA,
NA, NA, 1967024.456, 1967024.456, 1967024.456), eutm = c(NA,
NA, NA, 328346.6957, 328346.6957, 328346.6957)), .Names = c("X",
"dt", "tid", "rid", "sid", "uid", "nutm", "eutm"), row.names = c(NA,
6L), class = "data.frame")
答案 0 :(得分:2)
如果这些是data.tables(它们不是),
setkey(D1,dt,uid)
setkey(D2,dt,uid)
D1[D2,nomatch=0]
可能会这样做。