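R data.table warning about a column not being removed, only in RStudio?

Asked: 2016-06-21 22:31:39

Tags: r data.table rstudio-server

Consider the following example code, which is based on a real problem I ran into. Everything below runs without any issue.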

library(data.table)

pers_ids <- expand.grid(LETTERS,LETTERS,1:26,LETTERS,LETTERS)
pers_ids <- paste(pers_ids$Var1,
                  pers_ids$Var2,
                  pers_ids$Var3,
                  pers_ids$Var4,
                  pers_ids$Var5,
                  sep = "")

dt_pers_scores <- data.table(pers_id = pers_ids,
                             score1 = sample(x = 0:1,size = length(pers_ids),replace = TRUE,prob = c(0.4,0.6)),
                             score2 = sample(x = 0:1,size = length(pers_ids),replace = TRUE,prob = c(0.1,0.9)),
                             score3 = sample(x = 0:1,size = length(pers_ids),replace = TRUE,prob = c(0.2,0.8)),
                             score4 = sample(x = 0:1,size = length(pers_ids),replace = TRUE,prob = c(0.7,0.3)),
                             score5 = sample(x = 0:1,size = length(pers_ids),replace = TRUE,prob = c(0.6,0.4)))

dt_pers_group_mapping <- data.table(pers_id = pers_ids,
                                    group_id = sample(x = sample(x = LETTERS,size = 10,replace = FALSE),
                                                      size = length(pers_ids),
                                                      replace = TRUE))

# Set keys
setkey(dt_pers_scores,
       pers_id)
setkey(dt_pers_group_mapping,
       pers_id)

# Join tables and summarize
dt_group_scores <- dt_pers_group_mapping[dt_pers_scores[,list(pers_id,score1,score2,score4)]]
dt_group_scores <- dt_group_scores[,!"pers_id",with = FALSE][,as.list(ifelse(test = colSums(.SD) > 0,yes = 1,no = 0)),by = group_id]

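However, once this has run, my R session emits warnings from time to time, seemingly at random. I cannot reproduce this in the console version of R, only in RStudio Server. I have been struggling to trigger it on demand; it appears to happen almost at random.

When it does happen, these are the warnings: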

Warning messages:
1: In `[.data.table`(dt_group_scores, , !"pers_id", with = FALSE) :
  column(s) not removed because not found: pers_id
2: In `[.data.table`(dt_group_scores, , !"pers_id", with = FALSE) :
  column(s) not removed because not found: pers_id
3: In `[.data.table`(dt_group_scores, , !"pers_id", with = FALSE) :
  column(s) not removed because not found: pers_id
4: In `[.data.table`(dt_group_scores, , !"pers_id", with = FALSE) :
  column(s) not removed because not found: pers_id
5: In `[.data.table`(dt_group_scores, , !"pers_id", with = FALSE) :
  column(s) not removed because not found: pers_id
6: In `[.data.table`(dt_group_scores, , !"pers_id", with = FALSE) :
  column(s) not removed because not found: pers_id
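
For reference, the same warning text can be produced deterministically whenever the column-dropping expression is evaluated against a table that no longer has a `pers_id` column, which is the state `dt_group_scores` is in after the final assignment above. The following is only a minimal sketch on made-up toy data; `summarise_step` is a hypothetical helper name, and it does not explain why RStudio Server would re-evaluate the expression on its own.

```r
library(data.table)

# Hypothetical toy stand-in for the joined table (not the original data)
dt_group_scores <- data.table(pers_id  = c("AA1AA", "BA1AA", "CA1AA"),
                              group_id = c("X", "X", "Y"),
                              score1   = c(0, 1, 0))

# Same drop-then-summarise expression as in the question, wrapped in a helper
summarise_step <- function(dt) {
  dt[, !"pers_id", with = FALSE][
    , as.list(ifelse(test = colSums(.SD) > 0, yes = 1, no = 0)), by = group_id]
}

# First pass: pers_id exists, so it is dropped silently
first_pass <- summarise_step(dt_group_scores)

# Second pass on the already-summarised table: pers_id is gone,
# so data.table warns that the column could not be removed
second_msg <- tryCatch(summarise_step(first_pass),
                       warning = function(w) conditionMessage(w))
print(second_msg)
```

If the second call is what produces the message, then anything that evaluates that expression again after `dt_group_scores` has been overwritten would be expected to raise the same warning.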

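My gut feeling is that I am doing something wrong in how I create the summary table, perhaps because I have misunderstood pass-by-reference semantics or something along those lines. But I am not sure, especially since the problem seems to be limited to RStudio Server (I have not tried desktop RStudio). What could be causing these warning messages?

Edited to add: as requested in the comments, here is the output of sessionInfo(), run after loading data.table but before executing the code.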
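
One way to sidestep the negated column selection altogether is to name the score columns explicitly through `.SDcols`, so that nothing ever tries to drop `pers_id`. This is a sketch under assumptions, not a confirmed fix for the RStudio Server behaviour: the toy data, `score_cols`, and `dt_joined` are illustrative names that do not appear in the original code.

```r
library(data.table)

set.seed(1)  # toy data only; the original uses 26^5 ids
pers_ids <- paste0("P", 1:20)

dt_pers_scores <- data.table(pers_id = pers_ids,
                             score1 = sample(0:1, 20, replace = TRUE),
                             score2 = sample(0:1, 20, replace = TRUE),
                             score3 = sample(0:1, 20, replace = TRUE),
                             score4 = sample(0:1, 20, replace = TRUE))
dt_pers_group_mapping <- data.table(pers_id = pers_ids,
                                    group_id = sample(c("A", "B", "C"), 20, replace = TRUE))

setkey(dt_pers_scores, pers_id)
setkey(dt_pers_group_mapping, pers_id)

# Columns to summarise, named once and reused
score_cols <- c("score1", "score2", "score4")

# Keyed join, keeping the result under its own name instead of overwriting it
dt_joined <- dt_pers_group_mapping[dt_pers_scores[, c("pers_id", score_cols), with = FALSE]]

# 1 if any person in the group has a positive score, else 0;
# .SDcols restricts .SD to the score columns, so pers_id never needs dropping
dt_group_scores <- dt_joined[, lapply(.SD, function(x) as.numeric(sum(x) > 0)),
                             by = group_id, .SDcols = score_cols]
print(dt_group_scores)
```

Keeping the joined table and the summary under separate names also means that re-running the summary line is harmless, because its input always contains the columns it refers to.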

R version 3.3.0 (2016-05-03)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.4 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8      
 [8] LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.9.6

loaded via a namespace (and not attached):
[1] tools_3.3.0  chron_2.3-47
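
To narrow down when such warnings are raised, base R offers a couple of generic tools: `options(warn = 1)` prints each warning as soon as it occurs rather than deferring it, and `withCallingHandlers()` can log the call stack at the moment a warning is signalled. The sketch below uses only base R; `log_warning_stack` is a hypothetical helper name, and it can only observe warnings raised inside the expression it wraps, so it may not catch warnings triggered by the IDE itself.

```r
# Print warnings immediately instead of collecting them until the top-level call returns
old_opts <- options(warn = 1)

# Evaluate `expr`, and for every warning report its message and the call stack
log_warning_stack <- function(expr) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      message("Warning caught: ", conditionMessage(w))
      print(sys.calls())              # call stack at the point of the warning
      invokeRestart("muffleWarning")  # continue evaluation after logging
    }
  )
}

# Demonstration with an artificial warning
res <- log_warning_stack({
  warning("demo warning")
  42
})

# Restore the previous warning setting
options(old_opts)
```

Setting `options(warn = 2)` instead turns warnings into errors, after which `traceback()` can show where the failing call originated.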

0 answers:

No answers yet.