R data.table inside a function

Date: 2016-01-07 09:37:43

Tags: r data.table

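My problem is that when I pass a data.table to a setup function, the key set inside that function (with setkeyv) is lost once the function exits.

Could a data.table expert explain what I am doing wrong? I expected the variable dt to keep the key that was set inside the function, since I thought data.table works by reference. What have I misunderstood about scoping, and how do I fix it? (I am using data.table 1.9.6, R 3.2.2.)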

library(data.table)

# test data here for Stack overflow
#   yes I know one can key the data.table here
#   but normally read my data from csv file
dt <- data.table(date = c("2015-12-31","2016-01-01"), 
                 class = c("a","b"),
                 units = c(1000, 200))
tables()
# no key as you expect as not set
#NAME         NROW NCOL MB COLS                                                                         KEY
#[1,] dt              2    3  1 date,class,units

I then want to clean the imported csv data inside a function and key the table.

PrepareData <- function(x.dt, date.col, key.col) {
  # prepare unit price data table by keying on given columns and 
  #   converting date columns to date class

  require(data.table)
  require(lubridate)  # provides parse_date_time()

  # convert dates if date.col not blank
  if (!missing(date.col)) {
    if (nchar(date.col[1]) > 1) {
      for (j in date.col) {
        set(x.dt, j=j , 
            value = as.IDate(parse_date_time(x.dt[[j]], c("Ymd", "dmY"))))
        # Since data.table likes integer based dates
      }
    }    
  }

  # add key 
  if (!missing(key.col)) {
    if (nchar(key.col[1]) > 1) {
      setkeyv(x.dt, key.col)
    } 
  }

  # tables here shows a key is set
  tables()
  #NAME NROW NCOL MB COLS             KEY       
  #[1,] x.dt    2    3  1 date,class,units date,class

  return(x.dt)
}

But calling this function loses the key. I expected the key to be kept when the table is returned.

my.key.cols <-c("date", "class") # key columns

dt <- PrepareData(dt, "date", my.key.cols) 

tables()
#NAME NROW NCOL MB COLS             KEY
#[1,] dt      2    3  1 date,class,units    
#
# why did dt not keep the key? How should I fix this?
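For comparison, here is a minimal sketch of the by-reference behaviour the question expects, assuming a fresh session with only data.table attached. The helper name key_by_reference and the table name demo are hypothetical and not part of the original post.

```r
library(data.table)

# Hypothetical helper (not from the original post): keys a table by reference.
key_by_reference <- function(x, cols) {
  setkeyv(x, cols)  # modifies x in place; no copy is made
  invisible(x)
}

demo <- data.table(date  = c("2015-12-31", "2016-01-01"),
                   class = c("a", "b"),
                   units = c(1000, 200))

key_by_reference(demo, c("date", "class"))

# key() and haskey() inspect the caller's object directly,
# which is a more targeted check than scanning tables() output.
key(demo)
haskey(demo)
```

If key(demo) does not report the date and class columns in a given session, that suggests something other than scoping is interfering, such as another attached package.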

Edit: solved. The DT package was causing the problem.

R version 3.2.2 (2015-08-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8 x64 (build 9200)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ggplot2_2.0.0    DT_0.1           lubridate_1.5.0  data.table_1.9.6

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.2      digest_0.6.8     plyr_1.8.3       chron_2.3-47     grid_3.2.2       gtable_0.1.2     magrittr_1.5    
 [8] scales_0.3.0     stringi_1.0-1    tools_3.2.2      stringr_1.0.0    htmlwidgets_0.5  munsell_0.4.2    colorspace_1.2-6
[15] htmltools_0.3 

Removing DT fixes this.

1 answer:

Answer 0: (score: 1)

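Many thanks to @David Arenburg for confirming that it works on his side and for pointing out the old chestnut of package conflicts!

Running sessionInfo() showed that I had the DT package loaded. This conflicted with data.table. Removing the package fixed the error.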

R version 3.2.2 (2015-08-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8 x64 (build 9200)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ggplot2_2.0.0    DT_0.1           lubridate_1.5.0  data.table_1.9.6

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.2      digest_0.6.8     plyr_1.8.3       chron_2.3-47     grid_3.2.2       gtable_0.1.2     magrittr_1.5    
 [8] scales_0.3.0     stringi_1.0-1    tools_3.2.2      stringr_1.0.0    htmlwidgets_0.5  munsell_0.4.2    colorspace_1.2-6
[15] htmltools_0.3 

FIX

# specifying data.table explicitly with :: helped
data.table::setkeyv(x.dt, key.col)

I will leave this solution here in case anyone else runs into this conflict.
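A rough sketch of how such a conflict might be diagnosed, using only base R tools. Which names (if any) are masked depends on the packages and versions attached in a given session, so the output is not shown here. The table name check_dt is hypothetical.

```r
library(data.table)

# Which attached package does the unqualified name resolve to?
find("setkeyv")
environmentName(environment(setkeyv))

# List objects that are defined in more than one place on the search path.
conflicts(detail = TRUE)

# Qualifying the call with :: bypasses the search path entirely,
# so it does not matter what else is attached.
check_dt <- data.table(date  = c("2015-12-31", "2016-01-01"),
                       class = c("a", "b"))
data.table::setkeyv(check_dt, c("date", "class"))
key(check_dt)
```

Using the :: form inside reusable functions is a defensive choice: it costs a few characters but keeps the function independent of the order in which packages were attached.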