data.table: referencing a column of another data.table by name

Asked: 2016-08-17 09:43:34

Tags: r data.table

I have the following data.tables:

Comparison <- data.table(code = c("AAA", "BBB"),
                         elem1 = c(1, 2),
                         elem2 = c(4, 4))

DT <- data.table(A = c("AAA", "AAA", "AAA", "AAA"),
                 B = c("BBB", "BBB", "BBB", "BBB"),
                 C = c(1, 2, 3, 4))

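Now I want to add a new column to DT based on a comparison with a column of Comparison. The following command produces the expected output: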

DT[, newCol := {ifelse( abs(C - Comparison[code == "AAA", elem2]) == 0, "0", "1")}]

Output:

     A   B C newCol
1: AAA BBB 1      1
2: AAA BBB 2      1
3: AAA BBB 3      1
4: AAA BBB 4      0

However, if I use column A itself instead of hard-coding its value:

DT[, newCol := {ifelse( abs(C - Comparison[code == A, elem2]) > 0, "0", "1")}]

it throws the following error, which I don't know how to avoid:

Error in `[.data.table`(Comparison, code == A, elem2) : 
  RHS of == is length 4 which is not 1 or nrow (2). For robustness, no recycling is allowed (other than of length 1 RHS). Consider %in% instead.

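It looks to me as though the operation is not vectorised over the elements of column A of DT, and I really don't understand why, since the elements of column C are used correctly (i.e. it uses the elements of C one by one, but not those of A). How can I make this comparison?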

Any help is greatly appreciated.

3 Answers:

Answer 0 (score: 1)

One solution is to merge the two tables.

require(data.table)

Comparison <- data.table(code = c("AAA", "BBB"),
                         elem1 = c(1, 2),
                         elem2 = c(4, 4))
Comparison

DT <- data.table(A = c("AAA", "AAA", "AAA", "AAA"),
                 B = c("BBB", "BBB", "BBB", "BBB"),
                 C = c(1, 2, 3, 4))
DT

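# Bring elem1/elem2 alongside each row of DT by matching A against code
tmp <- merge(DT, Comparison, by.x = "A", by.y = "code")
# "0" where C equals elem2, "1" otherwise
tmp[, newCol := as.character(as.integer(C != elem2))]
tmp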
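
As a hedged variation on the merge above: merge() returns a new table, sorted by the key by default, which still carries the columns of Comparison, and by default it drops rows of DT that have no match. The sketch below uses a hypothetical DT2 (not from the question) with mixed codes and one unmatched code, keeps unmatched rows with all.x = TRUE, and removes the helper columns afterwards.

```r
library(data.table)

Comparison <- data.table(code = c("AAA", "BBB"),
                         elem1 = c(1, 2),
                         elem2 = c(4, 4))

# Hypothetical input: mixed codes, plus "CCC", which has no row in Comparison
DT2 <- data.table(A = c("BBB", "AAA", "CCC"),
                  C = c(4, 1, 4))

# all.x = TRUE keeps rows of DT2 without a match; their elem2 is NA
tmp <- merge(DT2, Comparison, by.x = "A", by.y = "code", all.x = TRUE)
tmp[, newCol := as.character(as.integer(C != elem2))]

# Drop the columns that were only needed for the comparison
tmp[, c("elem1", "elem2") := NULL]
tmp
```

Rows without a match end up with NA in newCol, which may or may not be what you want.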

Answer 1 (score: 1)

We can use a join with the on argument:
DT[Comparison, newCol := as.integer(C != elem2), on = c("A" = "code"), nomatch = 0]
DT
#     A   B C newCol
#1: AAA BBB 1      1
#2: AAA BBB 2      1
#3: AAA BBB 3      1
#4: AAA BBB 4      0
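
A small sketch of the same update join on a hypothetical DT2 (not from the question) in which A holds different codes. It writes "0"/"1" as character, as in the question, and uses the i. prefix to make explicit that elem2 comes from Comparison. DT2 is modified by reference and keeps its row order.

```r
library(data.table)

Comparison <- data.table(code = c("AAA", "BBB"),
                         elem1 = c(1, 2),
                         elem2 = c(4, 4))

# Hypothetical input: mixed codes, plus "CCC", which has no row in Comparison
DT2 <- data.table(A = c("AAA", "BBB", "AAA", "CCC"),
                  C = c(4, 1, 3, 4))

# For every row of DT2 whose A matches a code, compare C with that code's elem2
DT2[Comparison, on = c(A = "code"),
    newCol := ifelse(C == i.elem2, "0", "1")]
DT2
```

Rows of DT2 without a matching code are left untouched, so their newCol is NA.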

Answer 2 (score: 0)

If you read the error message, it says: Consider %in% instead.

Indeed, replacing == with %in% works, with no need for a join or merge.

DT[, newCol := {ifelse( abs(C - Comparison[code %in% A, elem2]) == 0, "0", "1")}]

DT

#     A   B C newCol
#1: AAA BBB 1      1
#2: AAA BBB 2      1
#3: AAA BBB 3      1
#4: AAA BBB 4      0
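
One caveat, as far as I can tell: the %in% version gives the right result here only because every row of DT has the same code in A, so a single elem2 value comes back and is recycled. %in% does not look up one value per row. The sketch below uses hypothetical data (Comparison2 and DT2 are not from the question) to show the difference, with base R's match() as a row-wise lookup that needs no join.

```r
library(data.table)

# Hypothetical data: two different codes in A, and different elem2 values
Comparison2 <- data.table(code = c("AAA", "BBB"), elem2 = c(4, 2))
DT2 <- data.table(A = c("BBB", "AAA", "BBB", "AAA"),
                  C = c(2, 4, 2, 4))

# %in% keeps the rows of Comparison2 whose code occurs anywhere in A,
# in Comparison2's own order; that vector is then recycled against C
by_in <- Comparison2[code %in% DT2$A, elem2]
DT2[, viaIn := ifelse(abs(C - by_in) == 0, "0", "1")]

# match() gives, for every row of DT2, the row of Comparison2 with the same code
DT2[, viaMatch := ifelse(C == Comparison2$elem2[match(A, Comparison2$code)], "0", "1")]
DT2
```

In this example every C equals the elem2 of its own code, so viaMatch is "0" throughout, while viaIn compares against the wrong elem2 on every row.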