Question

I have two data.tables, DT and ADT, and I want to join them on the columns a and new.a:

R> DT
   a   b
1: 1 1.0
2: 1 1.0
3: 2 2.0
4: 3 3.5
5: 4 4.5
6: 5 5.5

R> ADT
   new.a type
1:     1    3
2:     1    5
3:     2    3
4:     4    5
5:     4    3

R> setkey(DT, a)
R> DT[ADT[, new.a]]
# This is the desired result:
   a   b
1: 1 1.0
2: 1 1.0
3: 1 1.0
4: 1 1.0
5: 2 2.0
6: 4 4.5
7: 4 4.5

data.table treats the numeric values in ADT[, new.a] as a set of row numbers instead of producing the desired result.

DT[ADT[, new.a]] # taking row numbers... even truncating comma-values!
setkey(DT, a)
DT[ADT[, new.a]] # the key sorts the DT, so slightly different result, still using row numbers
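To make the row-number behaviour concrete, here is a minimal sketch that rebuilds the tables from the question so it runs on its own. The names idx, by_vector and by_rownum are only for illustration and are not part of the original post.

```r
library(data.table)

# Same data as in the question, rebuilt so the snippet is self-contained
DT <- data.table(a = c(1, 1, 2, 3, 4, 5),
                 b = c(1, 1, 2, 3.5, 4.5, 5.5))
ADT <- data.table(new.a = c(1, 1, 2, 4, 4),
                  type  = c(3, 5, 3, 5, 3))
setkey(DT, a)

idx <- ADT[, new.a]                # a plain numeric vector, not a data.table
by_vector <- DT[idx]               # numeric i is read as row numbers
by_rownum <- DT[c(1, 1, 2, 4, 4)]  # the same rows, requested explicitly

print(by_vector)
```

Both calls pick rows 1, 1, 2, 4 and 4 of DT, so the key on a plays no part in the lookup.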

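If I instead define the data.tables with character columns, I correctly get an error when attempting the join before setting the key, and the desired result afterwards. But is there a way to use numeric indices directly? Converting the whole DT to character up front could be slow...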

DTchar <- data.table(
  a = as.character(c(1, 2, 1, 3, 4, 5)),
  b = c(1, 2, 1, 3.5, 4.5, 5.5)
)
ADTchar <- data.table(
  new.a = as.character(c(1, 1, 2, 4, 4)),
  type  = as.character(c(3, 5, 3, 5, 3))
)

DTchar[ADTchar[, new.a]] # error - correctly
setkey(DTchar, a)
DTchar[ADTchar[, new.a]] # desired result

Answer 1

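First, you should use ADT[, list(new.a)], which returns a data.table, rather than ADT[, new.a], which returns a vector. You are also missing the argument allow.cartesian = TRUE.

DT[ADT[, list(new.a)], allow.cartesian = TRUE]
##    a   b
## 1: 1 1.0
## 2: 1 1.0
## 3: 1 1.0
## 4: 1 1.0
## 5: 2 2.0
## 6: 4 4.5
## 7: 4 4.5

From the documentation for the i argument of data.table: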

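Integer and logical vectors work the same way they do in [.data.frame.
character is matched to the first column of x's key.

When i is a data.table, x must have a key. i is joined to x using x's key and the rows in x that match are returned.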
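The difference described in the documentation can be sketched as follows, using the tables from the question. The J() and on= variants are alternatives that the answer does not mention. The on= form assumes a data.table version that supports that argument. The result names (res_list, res_J, res_on) are only illustrative. Whether allow.cartesian = TRUE is strictly required may depend on the installed version, so it is passed everywhere to be safe.

```r
library(data.table)

DT <- data.table(a = c(1, 1, 2, 3, 4, 5),
                 b = c(1, 1, 2, 3.5, 4.5, 5.5))
ADT <- data.table(new.a = c(1, 1, 2, 4, 4),
                  type  = c(3, 5, 3, 5, 3))
setkey(DT, a)

# i is a one-column data.table, so it is joined to DT's key column a
res_list <- DT[ADT[, list(new.a)], allow.cartesian = TRUE]

# J() wraps the vector in a data.table, which triggers the same join
res_J <- DT[J(ADT$new.a), allow.cartesian = TRUE]

# With on=, the join columns are named explicitly (x column = i column);
# all columns of ADT, including type, are carried into the result
res_on <- DT[ADT, on = c(a = "new.a"), allow.cartesian = TRUE]

print(res_list)
print(res_on)
```

In all three forms the numeric column stays numeric, so nothing has to be converted to character first.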

R data.table - different join behaviour for integer/numeric and character columns
