Looking at the fourth data.table vignette (Secondary indices and Auto indexing), it looks like example 2f returns the wrong month labels.
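library(data.table)
# read.csv() returns a data.frame; convert it so the data.table syntax below applies
flights <- as.data.table(read.csv(url("https://github.com/arunsrinivasan/flights/wiki/NYCflights14/flights14.csv")))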
The example gives:
> head(flights["JFK", max(dep_delay), keyby = month, on = "origin"])
month V1
1: 1 881
2: 1 1014
3: 1 920
4: 1 1241
5: 1 853
6: 1 798
But reproducing it without the secondary index gives:
> head(flights[origin == "JFK", max(dep_delay), keyby = month])
month V1
1: 1 881
2: 2 1014
3: 3 920
4: 4 1241
5: 5 853
6: 6 798
The error can be seen by looking up the row with dep_delay == 1014:
> flights[month =="1" & dep_delay == 1014]
Empty data.table (0 rows) of 17 cols: year,month,day,dep_time,dep_delay,arr_time...
> flights[month =="2" & dep_delay == 1014]
year month day dep_time dep_delay arr_time arr_delay cancelled carrier tailnum flight origin dest air_time distance hour min
1: 2014 2 21 844 1014 1151 1007 0 DL N983DL 2459 JFK MCO 139 944 8 44
Is this a mistake in the example code, or a data.table bug?