Assigning multiple columns by reference in a join using a character vector

Time: 2019-10-30 15:23:56

Tags: r data.table

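I am trying to create new columns during a join, with variable column names on both the LHS and the RHS of the assignment. Here is an MWE: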

library(data.table)

dt1 <- data.table(id = c(1,2), year = c(2001, 1999))
dt2 <- data.table(year = c(2001, 2002, 1998, 1999), id = c(1,1,2,2), a=c(50, 100, 30, 22), b=c(1,2,3,4))

cols <- c("a", "b")
t <- 1
dt1[, match_year := year + 1]

I hoped this would work:

dt1[dt2, on = .(id, match_year = year), 
    (paste0(cols, "_t", t)) := get(paste0("i.", cols))][]
   id year match_year a_t1 b_t1
1:  1 2001       2002  100  100
2:  2 1999       2000   NA   NA

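There is no error message, but the assignment seems to use only the first element of the character vector: b_t1 should be 2 in the first row, not 100.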

This works, but it is not a good solution because the column names on the RHS are hard-coded:

dt1[dt2, on = .(id, match_year = year), 
    (paste0(cols, "_t", t)) := list(i.a, i.b)][]
   id year match_year a_t1 b_t1
1:  1 2001       2002  100    2
2:  2 1999       2000   NA   NA

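The only way I have found to use variable names on both the LHS and the RHS is to build the call as a string and eval() it, but that is hard to read.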

txt <- paste0("dt1[dt2, on = .(id, match_year = year), ",
              "c(", paste0("\"", cols, "_t", t, "\"", collapse = ", "), ") := ",
              "list(", paste0("i.", cols, collapse = ", "), ")][]")
eval(parse(text = txt))
   id year match_year a_t1 b_t1
1:  1 2001       2002  100    2
2:  2 1999       2000   NA   NA

I am using data.table 1.12.7 IN DEVELOPMENT built 2019-10-29 16:48:24 UTC.

1 Answer:

Answer 0 (score: 0)

Is this an improvement?

dt1[, (paste0(cols, "_t", t)) := dt2[.SD, on = .(id, year = match_year), mget(cols)]]
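Another option might be to keep the original join direction and swap get() for mget(), which returns a list with one element per name instead of looking up a single name. The sketch below assumes that mget() can resolve the i.-prefixed names inside j during an update join; that is not confirmed anywhere in this thread for 1.12.7, so treat it as something to try rather than a guaranteed fix. The helper names new_cols and src_cols are introduced here only for illustration.

```r
library(data.table)

# Same example data as in the question
dt1 <- data.table(id = c(1, 2), year = c(2001, 1999))
dt2 <- data.table(year = c(2001, 2002, 1998, 1999), id = c(1, 1, 2, 2),
                  a = c(50, 100, 30, 22), b = c(1, 2, 3, 4))

cols <- c("a", "b")
t <- 1
dt1[, match_year := year + 1]

# Hypothetical helper names, not from the original post
new_cols <- paste0(cols, "_t", t)  # LHS: names of the columns to create
src_cols <- paste0("i.", cols)     # RHS: columns of dt2, referenced with the i. prefix

# Assumption: mget() sees the i.-prefixed columns in j and returns one list
# element per name, so each new column gets its own source column
dt1[dt2, on = .(id, match_year = year), (new_cols) := mget(src_cols)]
dt1[]
```

Both sides of := are then driven by cols, with no string building or eval() involved.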