Question

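A few days ago, I ran into the following data.table problem concerning value assignment:

Replace the values of a column (e.g. DT1$type) with the values of a column from another data.table (e.g. DT2$description), based on the current values of the former (i.e. where DT1$type == DT2$id).

I solved it the classic way (i.e. with a for loop), but I noticed that it takes a lot of time as the length of the data.table grows.

So I would like to know whether there is a more efficient way to get the same result.

My solution: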

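library(data.table)

# Define the sample data.tables
# (user is shortened to six values here so both columns have the same length;
#  the original had 8 users for 6 types, which is an assumed typo)
DT1 <- data.table( user = c(rep(1,2), rep(2,3), rep(3,1)),
                   type = c(1,2,1,4,2,3))

DT2 <- data.table( id = 1:4,
                   description = c( "aa", "bb", "cc", "dd"))

# set the keys
setkeyv(DT1,"user")
setkeyv(DT2, c("id","description"))

# Replace values one row at a time
for ( i in seq_along(DT1$type) ) {
  DT1$type[i] <- DT2[ DT2$id == DT1$type[i], description ]
}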

Answer 1

You have to set the keys on the columns you want to join on, and then use the data.table `[` operator. For example:

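   library(data.table)
   # user is shortened to six values so both columns have the same length
   # (assumed fix for the 8-vs-6 mismatch in the question's sample data)
   DT1 <- data.table( user = c(rep(1,2), rep(2,3), rep(3,1)),
               type = c(1,2,1,4,2,3))
   DT2 <- data.table( id = 1:4,
               description = c( "aa", "bb", "cc", "dd"))
   # key each table on its join column
   setkeyv(DT1,"type")
   setkeyv(DT2,"id")
   # keyed join: look up each DT2 row in DT1
   res<-DT1[DT2,]
   # drop the join column, leaving user and description
   res[,type:=NULL]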
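
As a possible alternative to the keyed join above, here is a minimal sketch of an update join, which adds the description to DT1 by reference and keeps DT1's original row order. It is not part of the original answer. It assumes a data.table version that supports the `on=` argument, and it declares `type` as integer (an assumption made here so it has the same type as `DT2$id`). The sample data uses six users so the column lengths match.

```r
library(data.table)

# Sample data (type declared as integer to match DT2$id; an assumption)
DT1 <- data.table(user = c(1, 1, 2, 2, 2, 3),
                  type = c(1L, 2L, 1L, 4L, 2L, 3L))
DT2 <- data.table(id = 1:4,
                  description = c("aa", "bb", "cc", "dd"))

# Update join: match DT1$type against DT2$id and add the matching
# description to DT1 by reference, without a loop
DT1[DT2, on = .(type = id), description := i.description]

# Replace the old numeric type column with the looked-up description
DT1[, type := NULL]
setnames(DT1, "description", "type")
```

Because a numeric column cannot be overwritten with character values in a partial assignment, the sketch writes to a new column first and then renames it.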

Most efficient way to replace values from one data.table to another?