Cannot recreate a data.table in R from dput output

Time: 2014-08-27 17:20:22

Tags: r data.table

I have the following data.table, and I cannot recreate it from the output of the dput command:

> ddt
   Unit Anything index new
1:    A      3.4     1   1
2:    A      6.9     2   1
3:   A1      1.1     1   2
4:   A1      2.2     2   2
5:    B      2.0     1   3
6:    B      3.0     2   3
> 
> 
> str(ddt)
Classes ‘data.table’ and 'data.frame':  6 obs. of  4 variables:
 $ Unit    : Factor w/ 3 levels "A","A1","B": 1 1 2 2 3 3
 $ Anything: num  3.4 6.9 1.1 2.2 2 3
 $ index   : num  1 2 1 2 1 2
 $ new     : int  1 1 2 2 3 3
 - attr(*, ".internal.selfref")=<externalptr> 
 - attr(*, "sorted")= chr  "Unit" "Anything"
> 
> 
> dput(ddt)
structure(list(Unit = structure(c(1L, 1L, 2L, 2L, 3L, 3L), .Label = c("A", 
"A1", "B"), class = "factor"), Anything = c(3.4, 6.9, 1.1, 2.2, 
2, 3), index = c(1, 2, 1, 2, 1, 2), new = c(1L, 1L, 2L, 2L, 3L, 
3L)), .Names = c("Unit", "Anything", "index", "new"), row.names = c(NA, 
-6L), class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x8948f68>, sorted = c("Unit", 
"Anything"))
> 

When I paste that output back into the console, I get the following error:

> dt = structure(list(Unit = structure(c(1L, 1L, 2L, 2L, 3L, 3L), .Label = c("A", 
+ "A1", "B"), class = "factor"), Anything = c(3.4, 6.9, 1.1, 2.2, 
+ 2, 3), index = c(1, 2, 1, 2, 1, 2), new = c(1L, 1L, 2L, 2L, 3L, 
+ 3L)), .Names = c("Unit", "Anything", "index", "new"), row.names = c(NA, 
+ -6L), class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x8948f68>, sorted = c("Unit", 
Error: unexpected '<' in:
"3L)), .Names = c("Unit", "Anything", "index", "new"), row.names = c(NA, 
-6L), class = c("data.table", "data.frame"), .internal.selfref = <"
> "Anything"))
Error: unexpected ')' in ""Anything")"

What is the problem, and how can I fix it? Thanks for your help.

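3 answers:

Answer 0 (score: 11)

The problem is that dput prints the address of an external pointer (the .internal.selfref attribute, which data.table uses internally and rebuilds when needed). The resulting `<pointer: 0x...>` text is not valid R syntax, so you cannot parse it back in.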

If you manually remove the .internal.selfref part, the code runs fine, apart from a one-time complaint from data.table on some operations.
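As a minimal sketch of that manual fix, here is the dput output from the question with only the `.internal.selfref = <pointer: ...>` argument deleted. The trailing `setDT()` call is an addition not mentioned in the answer; it assumes that `setDT()` on an existing data.table re-creates the internal self-reference, which should avoid the one-time complaint later.

```r
library(data.table)

# dput() output from the question, with the
# `.internal.selfref = <pointer: 0x8948f68>` argument removed by hand.
dt <- structure(list(Unit = structure(c(1L, 1L, 2L, 2L, 3L, 3L), .Label = c("A",
"A1", "B"), class = "factor"), Anything = c(3.4, 6.9, 1.1, 2.2,
2, 3), index = c(1, 2, 1, 2, 1, 2), new = c(1L, 1L, 2L, 2L, 3L,
3L)), .Names = c("Unit", "Anything", "index", "new"), row.names = c(NA,
-6L), class = c("data.table", "data.frame"), sorted = c("Unit",
"Anything"))

# Assumption: setDT() lets data.table rebuild its self-reference up front,
# instead of doing so (with a warning) on the first := call.
setDT(dt)

print(dt)
```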

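You could file a feature request (FR) with data.table for this, but it would require data.table to modify a base function, similar to how rbind is currently handled.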

Answer 1 (score: 4)

I also find this behaviour annoying, so I wrote my own dput function that ignores the .internal.selfref attribute.

dput <- function (x, file = "", control = c("keepNA", "keepInteger", 
                                    "showAttributes")) 
{
  if (is.character(file)) 
    if (nzchar(file)) {
      file <- file(file, "wt")
      on.exit(close(file))
    }
  else file <- stdout()
  opts <- .deparseOpts(control)
  # added for data.tables: drop the external pointer from a copy,
  # so the caller's object keeps its own .internal.selfref
  if (is.data.table(x)) {
    x <- copy(x)
    setattr(x, '.internal.selfref', NULL)
  }
  if (isS4(x)) {
    clx <- class(x)
    cat("new(\"", clx, "\"\n", file = file, sep = "")
    for (n in .slotNames(clx)) {
      cat("    ,", n, "= ", file = file)
      dput(slot(x, n), file = file, control = control)
    }
    cat(")\n", file = file)
    invisible()
  }
  else .Internal(dput(x, file, opts))
}
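If you would rather not mask base `dput`, the same idea can be sketched as a small wrapper. The name `dput_dt` is hypothetical (it is not part of data.table). It works on a copy so that the original object keeps its pointer, and the round trip at the end is only there to show that the output parses.

```r
library(data.table)

# Hypothetical helper: dput() a data.table without the external pointer.
dput_dt <- function(x, ...) {
  if (is.data.table(x)) {
    x <- copy(x)                           # leave the caller's object untouched
    setattr(x, ".internal.selfref", NULL)  # NULL removes the attribute
  }
  base::dput(x, ...)
}

# Example data mirroring the table in the question.
ddt <- data.table(
  Unit     = factor(c("A", "A", "A1", "A1", "B", "B")),
  Anything = c(3.4, 6.9, 1.1, 2.2, 2, 3),
  index    = c(1, 2, 1, 2, 1, 2),
  new      = c(1L, 1L, 2L, 2L, 3L, 3L),
  key      = c("Unit", "Anything")
)

# Round trip: capture the printed code, then parse and evaluate it.
txt <- capture.output(dput_dt(ddt))
dt2 <- eval(parse(text = txt))
setDT(dt2)  # assumption: restores the self-reference on the rebuilt object
```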

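Answer 2 (score: 0)

If you have already written the object to a file with dput and do not want to edit that file by hand before calling dget, you can use the following: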

data.table.parse<-function (file = "", n = NULL, text = NULL, prompt = "?", keep.source = getOption("keep.source"), 
                            srcfile = NULL, encoding = "unknown") 
{
  keep.source <- isTRUE(keep.source)
  if (!is.null(text)) {
    if (length(text) == 0L) 
      return(expression())
    if (missing(srcfile)) {
      srcfile <- "<text>"
      if (keep.source) 
        srcfile <- srcfilecopy(srcfile, text)
    }
    file <- stdin()
  }
  else {
    if (is.character(file)) {
      if (file == "") {
        file <- stdin()
        if (missing(srcfile)) 
          srcfile <- "<stdin>"
      }
      else {
        filename <- file
        file <- file(filename, "r")
        if (missing(srcfile)) 
          srcfile <- filename
        if (keep.source) {
          text <- readLines(file, warn = FALSE)
          if (!length(text)) 
            text <- ""
          close(file)
          file <- stdin()
          srcfile <- srcfilecopy(filename, text, file.mtime(filename), 
                                 isFile = TRUE)
        }
        else {
          text <- readLines(file, warn = FALSE)
          if (!length(text)) {
            text <- ""
          } else {
            text <- gsub("(, .internal.selfref = <pointer: 0x[0-9A-Fa-f]+>)","",text,perl=TRUE)
          }
          on.exit(close(file))
        }
      }
    }
  }
  #  text <- gsub("(, .internal.selfref = <pointer: 0x[0-9A-F]+>)","",text)
  .Internal(parse(file, n, text, prompt, srcfile, encoding))
}
data.table.get <- function(file, keep.source = FALSE)
  eval(data.table.parse(file = file, keep.source = keep.source))
dtget <- data.table.get

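Then change your calls to dget into calls to dtget. Note that the inline text substitution makes dtget slower than dget, so use it only where the object you retrieve may be a data.table.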
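A shorter variant of the same idea, offered as a sketch rather than a drop-in replacement for the function above: read the file as text, strip the pointer argument with a regular expression, and evaluate the rest. The name `dtget_simple` is hypothetical, and the regular expression assumes the pointer is printed in the `<pointer: ...>` form shown in the question.

```r
library(data.table)

# Hypothetical helper: dget() for files that contain a dput() of a data.table.
dtget_simple <- function(file) {
  text <- paste(readLines(file, warn = FALSE), collapse = "\n")
  # Remove ", .internal.selfref = <pointer: ...>" wherever it appears.
  text <- gsub(",\\s*\\.internal\\.selfref\\s*=\\s*<pointer:[^>]*>", "",
               text, perl = TRUE)
  x <- eval(parse(text = text))
  if (is.data.table(x)) setDT(x)  # assumption: rebuilds the self-reference
  x
}

# Usage: write a data.table with plain dput(), then read it back.
ddt <- data.table(
  Unit     = factor(c("A", "A", "A1", "A1", "B", "B")),
  Anything = c(3.4, 6.9, 1.1, 2.2, 2, 3)
)
tmp <- tempfile(fileext = ".R")
dput(ddt, file = tmp)
dt3 <- dtget_simple(tmp)
```

Because the substitution is applied to the whole file, it would also alter any character data that happens to contain the same pattern, so this is best kept for files you know came from dput.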