This is a query from an earlier thread I came across. It involves two tables, DT1 and DT2.
DT1
Country State City Start End
1 IN Telangana Hyderabad 100 200
2 IN Maharashtra Pune 300 400
3 IN Haryana Gurgaon 500 600
4 IN Maharashtra Pune 700 800
5 IN Gujarat Ahmedabad 900 1000
DT2 with 7 rows
ID No
1 157
2 346
3 389
4 453
5 562
6 9874
7 98745
When the two are joined with this code,
DT2[DT1, on=.(No>Start,No<End), ]
this output is produced, containing 6 rows:
ID No No.1 Country State City
1: 1 100 200 IN Telangana Hyderabad
2: 2 300 400 IN Maharashtra Pune
3: 3 300 400 IN Maharashtra Pune
4: 5 500 600 IN Haryana Gurgaon
5: NA 700 800 IN Maharashtra Pune
6: NA 900 1000 IN Gujarat Ahmedabad
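I can understand the NAs corresponding to IDs 6 and 7 (row numbers 5 and 6), but why is there no NA corresponding to ID 4? ID 4 has No 453, which does not fall within any range in DT1, so shouldn't it produce an NA as well?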
EDIT1: Adding code to create the datasets
DT1<-
structure(list(Country = structure(c(1L, 1L, 1L, 1L, 1L), .Label = "IN", class = "factor"),
State = structure(c(4L, 3L, 2L, 3L, 1L), .Label = c("Gujarat",
"Haryana", "Maharashtra", "Telangana"), class = "factor"),
City = structure(c(3L, 4L, 2L, 4L, 1L), .Label = c("Ahmedabad",
"Gurgaon", "Hyderabad", "Pune"), class = "factor"), Start = c(100L,
300L, 500L, 700L, 900L), End = c(200L, 400L, 600L, 800L,
1000L)), .Names = c("Country", "State", "City", "Start",
"End"), class = c("data.table", "data.frame"))
DT2<-
structure(list(ID = 1:7, No = c(157L, 346L, 389L, 453L, 562L,
9874L, 98745L)), .Names = c("ID", "No"), class = c("data.table",
"data.frame"))
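
For comparison, here is a minimal sketch of the same join run in both directions. It assumes that an X[Y] join in data.table returns one row per row of Y (the table inside the brackets), so the 6-row output above follows the ranges in DT1 rather than the IDs in DT2. The data is rebuilt compactly with character columns instead of factors, and the names by_range and by_id are only illustrative.

```r
library(data.table)

# Same values as in the question, rebuilt compactly (assumption: character
# columns instead of factors, which does not affect the join itself)
DT1 <- data.table(
  Country = "IN",
  State   = c("Telangana", "Maharashtra", "Haryana", "Maharashtra", "Gujarat"),
  City    = c("Hyderabad", "Pune", "Gurgaon", "Pune", "Ahmedabad"),
  Start   = c(100L, 300L, 500L, 700L, 900L),
  End     = c(200L, 400L, 600L, 800L, 1000L)
)
DT2 <- data.table(ID = 1:7, No = c(157L, 346L, 389L, 453L, 562L, 9874L, 98745L))

# Direction used in the question: rows follow the ranges in DT1, so an NA
# appears only for a range that contains no value of No
by_range <- DT2[DT1, on = .(No > Start, No < End)]

# Reversed direction: rows follow the IDs in DT2, so an NA appears for every
# No that falls in no range. The i. and x. prefixes pick DT2's and DT1's own
# columns, because the join columns are otherwise overwritten in the result.
by_id <- DT1[DT2, on = .(Start < No, End > No),
             .(ID = i.ID, No = i.No, City, Start = x.Start, End = x.End)]

print(by_range)
print(by_id)
```

If that assumption holds, by_id should have 7 rows, one per ID, with NA in City, Start and End for the IDs whose No lies outside every range.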