Consider the following example code, which is based on a real problem I ran into. Everything below runs without any errors.
library(data.table)

pers_ids <- expand.grid(LETTERS, LETTERS, 1:26, LETTERS, LETTERS)
pers_ids <- paste(pers_ids$Var1,
                  pers_ids$Var2,
                  pers_ids$Var3,
                  pers_ids$Var4,
                  pers_ids$Var5,
                  sep = "")

dt_pers_scores <- data.table(pers_id = pers_ids,
                             score1 = sample(x = 0:1, size = length(pers_ids), replace = TRUE, prob = c(0.4, 0.6)),
                             score2 = sample(x = 0:1, size = length(pers_ids), replace = TRUE, prob = c(0.1, 0.9)),
                             score3 = sample(x = 0:1, size = length(pers_ids), replace = TRUE, prob = c(0.2, 0.8)),
                             score4 = sample(x = 0:1, size = length(pers_ids), replace = TRUE, prob = c(0.7, 0.3)),
                             score5 = sample(x = 0:1, size = length(pers_ids), replace = TRUE, prob = c(0.6, 0.4)))

dt_pers_group_mapping <- data.table(pers_id = pers_ids,
                                    group_id = sample(x = sample(x = LETTERS, size = 10, replace = FALSE),
                                                      size = length(pers_ids),
                                                      replace = TRUE))

# Set keys
setkey(dt_pers_scores,
       pers_id)
setkey(dt_pers_group_mapping,
       pers_id)

# Join tables and summarize
dt_group_scores <- dt_pers_group_mapping[dt_pers_scores[, list(pers_id, score1, score2, score4)]]
dt_group_scores <- dt_group_scores[, !"pers_id", with = FALSE][, as.list(ifelse(test = colSums(.SD) > 0, yes = 1, no = 0)), by = group_id]
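However, once this has run, my R session emits warnings from time to time, seemingly at random. I cannot reproduce this in the console version of R, only in RStudio Server. I have been trying hard to trigger it on demand, but it appears to happen almost at random.
When it does happen, these are the warnings: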
Warning messages:
1: In `[.data.table`(dt_group_scores, , !"pers_id", with = FALSE) :
column(s) not removed because not found: pers_id
2: In `[.data.table`(dt_group_scores, , !"pers_id", with = FALSE) :
column(s) not removed because not found: pers_id
3: In `[.data.table`(dt_group_scores, , !"pers_id", with = FALSE) :
column(s) not removed because not found: pers_id
4: In `[.data.table`(dt_group_scores, , !"pers_id", with = FALSE) :
column(s) not removed because not found: pers_id
5: In `[.data.table`(dt_group_scores, , !"pers_id", with = FALSE) :
column(s) not removed because not found: pers_id
6: In `[.data.table`(dt_group_scores, , !"pers_id", with = FALSE) :
column(s) not removed because not found: pers_id
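My gut feeling is that I am doing something wrong in the way I create the summary table, and that I am somehow messing things up by not understanding pass-by-reference or something along those lines. But I am not sure, especially since the problem seems to be limited to RStudio Server (I have not tried it in desktop RStudio). What could be causing these warning messages?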
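To make the summary step concrete, here is a minimal sketch of the same aggregation on a small hypothetical table. It assumes the intent is "1 if any person in the group scored 1, else 0". The names dt_joined, dt_summary and score_cols are made up for this illustration and do not appear in my real code. It selects the score columns with .SDcols instead of dropping pers_id first, and it keeps the joined table and the summary under separate names. This is only an alternative way of writing the step, not a diagnosis of the warnings.

```r
library(data.table)

# Hypothetical toy data standing in for the joined table
dt_joined <- data.table(
  pers_id  = c("AA1AA", "AB1AA", "AC1AA", "AD1AA"),
  group_id = c("X", "X", "Y", "Y"),
  score1   = c(0L, 1L, 0L, 0L),
  score2   = c(1L, 1L, 0L, 1L),
  score4   = c(0L, 0L, 0L, 0L)
)

# Columns to summarize; pers_id is simply not listed, so it never has to be removed
score_cols <- c("score1", "score2", "score4")

# Per group: 1 if the column sum is positive, otherwise 0
dt_summary <- dt_joined[, lapply(.SD, function(x) as.numeric(sum(x) > 0)),
                        by = group_id, .SDcols = score_cols]

print(dt_summary)
```

Because the result is assigned to a new name, dt_joined still contains pers_id afterwards, which differs from my original code where dt_group_scores is overwritten.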
Edited to add: as requested in the comments, here is the output of sessionInfo(), run before executing the code but after loading data.table.
R version 3.3.0 (2016-05-03)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.4 LTS
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8
[8] LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.9.6
loaded via a namespace (and not attached):
[1] tools_3.3.0 chron_2.3-47