readRDS()加载额外的包

时间:2013-10-02 20:56:38

标签: r serialization

在什么情况下R中的readRDS()函数尝试加载包/命名空间?我很惊讶在新的R会议中看到以下内容:

> loadedNamespaces()
[1] "base"      "datasets"  "graphics"  "grDevices" "methods"   "stats"    
[7] "tools"     "utils"    
> x <- readRDS('../../../../data/models/my_model.rds')
There were 19 warnings (use warnings() to see them)
> loadedNamespaces()
 [1] "base"         "class"        "colorspace"   "data.table"  
 [5] "datasets"     "dichromat"    "e1071"        "earth"       
 [9] "evaluate"     "fields"       "formatR"      "gbm"         
[13] "ggthemes"     "graphics"     "grDevices"    "grid"        
[17] "Iso"          "knitr"        "labeling"     "lattice"     
[21] "lubridate"    "MASS"         "methods"      "munsell"     
[25] "plotmo"       "plyr"         "proto"        "quantreg"    
[29] "randomForest" "RColorBrewer" "reshape2"     "rJava"       
[33] "scales"       "spam"         "SparseM"      "splines"     
[37] "stats"        "stringr"      "survival"     "tools"       
[41] "utils"        "wra"          "wra.ops"      "xlsx"        
[45] "xlsxjars"     "xts"          "zoo"     

如果这些新软件包中的任何一个都不可用,readRDS()将失败。

提到的19个警告是:

> warnings()
Warning messages:
1: replacing previous import ‘hour’ when loading ‘data.table’
2: replacing previous import ‘last’ when loading ‘data.table’
3: replacing previous import ‘mday’ when loading ‘data.table’
4: replacing previous import ‘month’ when loading ‘data.table’
5: replacing previous import ‘quarter’ when loading ‘data.table’
6: replacing previous import ‘wday’ when loading ‘data.table’
7: replacing previous import ‘week’ when loading ‘data.table’
8: replacing previous import ‘yday’ when loading ‘data.table’
9: replacing previous import ‘year’ when loading ‘data.table’
10: replacing previous import ‘here’ when loading ‘plyr’
11: replacing previous import ‘hour’ when loading ‘data.table’
12: replacing previous import ‘last’ when loading ‘data.table’
13: replacing previous import ‘mday’ when loading ‘data.table’
14: replacing previous import ‘month’ when loading ‘data.table’
15: replacing previous import ‘quarter’ when loading ‘data.table’
16: replacing previous import ‘wday’ when loading ‘data.table’
17: replacing previous import ‘week’ when loading ‘data.table’
18: replacing previous import ‘yday’ when loading ‘data.table’
19: replacing previous import ‘year’ when loading ‘data.table’

显然它正在加载类似lubridate然后data.table的内容,因此会产生命名空间冲突。

FWIW,unserialize()给出相同的结果。

我真正想要的是加载这些对象而不加载保存它们的人当时似乎已经加载的所有东西,这就像它正在做的那样。

更新:以下是对象x中的类:

> classes <- function(x) {
    cl <- c()
    for(i in x) {
      cl <- c(cl, if(is.list(i)) c(class(i), classes(i)) else class(i))
    }
    cl
  }
> unique(classes(x))
 [1] "list"              "numeric"           "rq"               
 [4] "terms"             "formula"           "call"             
 [7] "character"         "smooth.spline"     "integer"          
[10] "smooth.spline.fit"

qr来自quantreg包,其余全部来自basestats

2 个答案:

答案 0 :(得分:5)

确定。这可能不是一个有用的答案(需要更多细节),但我认为它至少是“在什么情况下......”的一部分。

首先,我认为它并非特定于readRDS,但对于可以save编辑的任何load'对象的工作方式相同。

“在什么情况下”部分:当保存的对象包含具有包/命名空间环境作为父项的环境时。或者当它包含一个其环境是包/名称空间环境的函数时。

require(Matrix)
foo <- list(
   a = 1,
   b = new.env(parent=environment(Matrix)),
   c = "c")
save(foo, file="foo.rda")
loadedNamespaces()   # Matrix is there!
detach("package:Matrix")
unloadNamespace("Matrix")
loadedNamespaces()   # no Matrix there!
load("foo.rda")
loadedNamespaces()   # Matrix is back again

以下也有效:

require(Matrix)
bar <- list(
   a = 1,
   b = force,
   c = "c")
environment(bar$b) <- environment(Matrix)
save(bar, file="bar.rda")
loadedNamespaces()      # Matrix is there!
detach("package:Matrix")
unloadNamespace("Matrix")
loadedNamespaces()      # no Matrix there!
load("bar.rda")
loadedNamespaces()      # Matrix is back!

我没有尝试,但没有理由为什么它不应该与saveRDS / readRDS以相同的方式工作。解决方案:如果这对保存的对象没有任何损害(即,如果您确定实际上不需要环境),则可以通过替换它们来删除父环境,例如,通过将parent.env设置为有意义的内容。所以使用上面的foo

parent.env(foo$b) <- baseenv()
save(foo, file="foo.rda")
loadedNamespaces()        # Matrix is there ....
unloadNamespace("Matrix")
loadedNamespaces()        # no Matrix there ...
load("foo.rda")
loadedNamespaces()        # still no Matrix ...

答案 1 :(得分:1)

我提出的一个痛苦的解决方法是通过令人讨厌的评估来清理它附着的任何环境的对象:

sanitizeEnvironments <- function(obj) {
    tc <- textConnection(NULL, 'w')
    dput(obj, tc)
    source(textConnection(textConnectionValue(tc)))$value
}

我可以使用旧对象,通过此函数运行它,然后再次执行saveRDS()。然后加载新对象不会在我的命名空间中遍布块。