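In dplyr, referring to a column that was just created in the same call works on tbl_dfs but not on data.tables. Is this expected? I did not see anything about it in the documentation.
Here is a reproducible example: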
library("hflights")
library("plyr")
library("dplyr")
library("data.table")
hflights_df <- tbl_df(hflights)
summarise(hflights_df,
          delay = mean(DepDelay, na.rm = TRUE),
          delay2 = 2 * delay)
## Source: local data frame [1 x 2]
##
## delay delay2
## 1 9.444951 18.8899
hflights_dt <- data.table(hflights_df)
summarise(hflights_dt,
          delay = mean(DepDelay, na.rm = TRUE),
          delay2 = 2 * delay)
## Error in eval(expr, envir, enclos) : object 'delay' not found
Here is my sessionInfo:
sessionInfo()
## R version 3.0.2 (2013-09-25)
## Platform: x86_64-apple-darwin10.8.0 (64-bit)
##
## locale:
## [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] data.table_1.9.2 dplyr_0.2 plyr_1.8.1 hflights_0.1
## [5] andRstuff_1.0 devtools_1.5.0.99 tikzDevice_0.7.0 filehash_2.2-2
##
## loaded via a namespace (and not attached):
## [1] assertthat_0.1 compiler_3.0.2 digest_0.6.4 evaluate_0.5.3 grid_3.0.2
## [6] httr_0.3 memoise_0.1 parallel_3.0.2 Rcpp_0.11.1 RCurl_1.95-4.1
## [11] reshape2_1.2.2 stringr_0.6.2 tools_3.0.2 whisker_0.3-2
Edit: this works (similar to the suggestions from jazzurro and KFB):
summarise(tbl_df(hflights_dt),
          delay = mean(DepDelay, na.rm = TRUE),
          delay2 = 2 * delay)
## Source: local data frame [1 x 2]
##
## delay delay2
## 1 9.444951 18.8899
But this does not work:
summarise(tbl_dt(hflights_dt),
          delay = mean(DepDelay, na.rm = TRUE),
          delay2 = 2 * delay)
## Error in eval(expr, envir, enclos) : object 'delay' not found
Answer 0 (score: 0)
> hflights_df <- tbl_df(hflights)
> hflights_dt <- as.data.table(hflights_df)
> summarise(hflights_dt,
+ delay = mean(DepDelay, na.rm = TRUE),
+ delay2 = 2*delay)
Source: local data frame [1 x 2]
delay delay2
1 9.444951 18.8899