Below is an example where foverlaps(...) appears to find matches that do not overlap. Can anyone help me understand what I am doing wrong?
The question in this post seemed like a perfect opportunity to use foverlaps(...) from the data.table package. The dataset below comes from that post.
dinosaurs <- structure(list(GENUS = structure(1:3, .Label = c("Abydosaurus", "Achelousaurus", "Acheroraptor"), class = "factor"), ma_max = c(109, 84.9, 70.6), ma_min = c(94.3, 70.6, 66.043), ma_mid = c(101.65, 77.75, 68.3215)), .Names = c("GENUS", "ma_max", "ma_min", "ma_mid"), class = "data.frame", row.names = c(NA, -3L))
stages <- structure(list(Stage = structure(c(13L, 19L, 17L, 21L, 1L, 4L, 6L, 8L, 16L, 14L, 20L, 7L, 23L, 12L, 5L, 3L, 2L, 10L, 22L, 11L, 18L, 9L, 15L), .Label = c("Aalenian", "Albian", "Aptian", "Bajocian", "Barremian", "Bathonian", "Berriasian", "Callovian", "Campanian", "Cenomanian", "Coniacian", "Hauterivian", "Hettangian", "Kimmeridgian", "Maastrichtian", "Oxfordian", "Pliensbachian", "Santonian", "Sinemurian", "Tithonian", "Toarcian", "Turonian", "Valanginian"), class = "factor"),ma_max = c(201.6, 197, 190, 183, 176, 172, 168, 165, 161, 156, 151, 145.5, 140, 136, 130, 125, 112, 99.6, 93.5, 89.3, 85.8, 83.5, 70.6), ma_min = c(197, 190, 183, 176, 172, 168, 165, 161, 156, 151, 145.5, 140, 136, 130, 125, 112, 99.6, 93.5, 89.3, 85.8, 83.5, 70.6, 66.5), ma_mid = c(199.3, 193.5, 186.5, 179.5, 174, 170, 166.5, 163, 158.5, 153.5, 148.25, 142.75, 138, 133, 127.5, 118.5, 105.8, 96.55, 91.4, 87.55, 84.65, 77.05, 68.05)), .Names = c("Stage", "ma_max", "ma_min", "ma_mid"), class = "data.frame", row.names = c(NA, -23L))
dinosaurs
#           GENUS ma_max ma_min   ma_mid
# 1   Abydosaurus  109.0 94.300 101.6500
# 2 Achelousaurus   84.9 70.600  77.7500
# 3  Acheroraptor   70.6 66.043  68.3215
head(stages)
#           Stage ma_max ma_min ma_mid
# 1    Hettangian  201.6    197  199.3
# 2    Sinemurian  197.0    190  193.5
# 3 Pliensbachian  190.0    183  186.5
# 4      Toarcian  183.0    176  179.5
# 5      Aalenian  176.0    172  174.0
# 6      Bajocian  172.0    168  170.0
The goal is to count how many dinosaur genera were present during each geological stage.
library(data.table) # 1.9.4
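# Convert to data.table by reference and drop the unused midpoint column
setDT(dinosaurs)[, ma_mid := NULL]
setDT(stages)[, ma_mid := NULL]
# foverlaps() needs its second argument (y) keyed on the interval columns
setkey(dinosaurs, ma_min, ma_max)
# For each stage, return every genus whose range overlaps it; drop unmatched stages
foverlaps(stages, dinosaurs, type = "any", nomatch = 0)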
#            GENUS ma_max ma_min         Stage i.ma_max i.ma_min
# 1:   Abydosaurus  109.0 94.300        Albian    112.0     99.6
# 2:   Abydosaurus  109.0 94.300    Cenomanian     99.6     93.5
# 3: Achelousaurus   84.9 70.600     Coniacian     89.3     85.8
# 4: Achelousaurus   84.9 70.600     Santonian     85.8     83.5
# 5:  Acheroraptor   70.6 66.043     Campanian     83.5     70.6
# 6: Achelousaurus   84.9 70.600     Campanian     83.5     70.6
# 7:  Acheroraptor   70.6 66.043 Maastrichtian     70.6     66.5
# 8: Achelousaurus   84.9 70.600 Maastrichtian     70.6     66.5
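This is mostly correct, but look at row 3. It seems to assert that the Coniacian stage, which ran from 85.8 to 89.3 million years ago, overlaps with Achelousaurus, which lived from 70.6 to 84.9 million years ago. What am I missing?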
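To make the objection to row 3 concrete, here is a minimal base R sketch of the closed-interval overlap test. The helper name intervals_overlap is hypothetical and is not part of data.table; it only spells out the arithmetic.

```r
# Hypothetical helper (not a data.table function): two closed intervals
# overlap when each one starts no later than the other ends.
intervals_overlap <- function(a_min, a_max, b_min, b_max) {
  a_min <= b_max & b_min <= a_max
}

# Achelousaurus (70.6 to 84.9) against Coniacian (85.8 to 89.3)
intervals_overlap(70.6, 84.9, 85.8, 89.3)  # FALSE: 85.8 > 84.9

# Achelousaurus (70.6 to 84.9) against Santonian (83.5 to 85.8)
intervals_overlap(70.6, 84.9, 83.5, 85.8)  # TRUE
```

By this test the Coniacian row should not appear in the result, while the Santonian row should.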
Answer 0 (score: 2)
On 1.9.5, I get:
#            GENUS ma_max ma_min         Stage i.ma_max i.ma_min
# 1:   Abydosaurus  109.0 94.300        Albian    112.0     99.6
# 2:   Abydosaurus  109.0 94.300    Cenomanian     99.6     93.5
# 3: Achelousaurus   84.9 70.600     Santonian     85.8     83.5
# 4:  Acheroraptor   70.6 66.043     Campanian     83.5     70.6
# 5: Achelousaurus   84.9 70.600     Campanian     83.5     70.6
# 6:  Acheroraptor   70.6 66.043 Maastrichtian     70.6     66.5
# 7: Achelousaurus   84.9 70.600 Maastrichtian     70.6     66.5
Most likely this was a floating-point bug that was fixed in 1.9.5 in this commit. It would be great if you could verify this as well.
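One way to check a foverlaps(...) result independently of the data.table version is a brute-force comparison in base R. The sketch below is only an illustration and makes two assumptions: the stage table is trimmed to the seven stages from Albian to Maastrichtian (every earlier stage ends at 112 or earlier, so none can overlap these three genera), and the variable names overlaps and genera_per_stage are made up for this example.

```r
# Genus ranges, copied from the question (ma_mid omitted)
dinosaurs <- data.frame(
  GENUS  = c("Abydosaurus", "Achelousaurus", "Acheroraptor"),
  ma_max = c(109, 84.9, 70.6),
  ma_min = c(94.3, 70.6, 66.043),
  stringsAsFactors = FALSE
)

# Assumption: only the last seven stages of the question's table are kept
stages <- data.frame(
  Stage  = c("Albian", "Cenomanian", "Turonian", "Coniacian",
             "Santonian", "Campanian", "Maastrichtian"),
  ma_max = c(112, 99.6, 93.5, 89.3, 85.8, 83.5, 70.6),
  ma_min = c(99.6, 93.5, 89.3, 85.8, 83.5, 70.6, 66.5),
  stringsAsFactors = FALSE
)

# overlaps[i, j] is TRUE when genus i and stage j share at least one point
# (closed intervals, the same sense as type = "any")
overlaps <- outer(dinosaurs$ma_min, stages$ma_max, "<=") &
            outer(dinosaurs$ma_max, stages$ma_min, ">=")
dimnames(overlaps) <- list(dinosaurs$GENUS, stages$Stage)

# Number of genera present in each stage, which was the original goal
genera_per_stage <- colSums(overlaps)

overlaps
genera_per_stage
```

This brute-force check finds seven overlapping pairs, with no Achelousaurus/Coniacian match, which agrees with the 1.9.5 output above rather than the 1.9.4 output. It compares every genus with every stage, so it is only suitable as a sanity check on small tables.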