Question

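In data.table, with = TRUE is the default and j is evaluated within the frame of x, which lets you use column names as variables. When with = FALSE, j is a vector of column names or positions to select.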
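To make the distinction concrete, here is a minimal sketch of the two modes when selecting columns. The toy table and the variable name cols are illustrative assumptions, not part of the original post.

```r
library(data.table)

# Small illustrative table (values are made up for this sketch)
DT <- data.table(x = c(1, 2, 3), v = c(10, 20, 30))
cols <- c("x", "v")   # hypothetical vector of column names

# with = TRUE (the default): j is an expression and x, v refer to columns
total <- DT[, sum(x * v)]      # 1*10 + 2*20 + 3*30 = 140

# with = FALSE: j is treated as a vector of column names to select
sel <- DT[, cols, with = FALSE]
```

In the first call the column names behave like ordinary variables; in the second, cols is looked up in the calling scope and its contents decide which columns come back.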

I managed to find some examples of with = FALSE.

library(data.table)
set.seed(1234)
DT <- data.table(x=rep(c(1,2,3),each=4), y=c("A","B"), v=sample(1:100,12))
## The asker's solution
# first step is to create cumsum columns
colNames <- c("x","v"); newColNames <- paste0("SUM.",colNames)
DT[, newColNames := lapply(.SD,cumsum) ,by=y, .SDcols = colNames, with=FALSE];
test <- DT[, newColNames:=lapply(.SD,cumsum) ,by=y, .SDcols=colNames, with=TRUE];

We can check that DT is:

> DT # setting `with=FALSE` - what I require
    x y  v SUM.x SUM.v
 1: 1 A 12     1    12
 2: 1 B 62     1    62
 3: 1 A 60     2    72
 4: 1 B 61     2   123
 5: 2 A 83     4   155
 6: 2 B 97     4   220
 7: 2 A  1     6   156
 8: 2 B 22     6   242
 9: 3 A 99     9   255
10: 3 B 47     9   289
11: 3 A 63    12   318
12: 3 B 49    12   338

and test is:

> test # this is when setting " with = TRUE"
    x y  v newColNames
 1: 1 A 12           1
 2: 1 B 62           1
 3: 1 A 60           2
 4: 1 B 61           2
 5: 2 A 83           4
 6: 2 B 97           4
 7: 2 A  1           6
 8: 2 B 22           6
 9: 3 A 99           9
10: 3 B 47           9
11: 3 A 63          12
12: 3 B 49          12

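I don't understand why the result looks like this when with = TRUE is set. So my question is basically: when is with = TRUE useful?

I also don't understand why the default is with = TRUE, although there must be a good reason for it.

Many thanks!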

Answer 1

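I see your point. We have moved away from using with=TRUE|FALSE in combination with :=, because it was unclear whether with=TRUE referred to the left-hand side of := or the right-hand side. Instead, wrapping the LHS of := in parentheses is now preferred.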

DT[, x.sum:=cumsum(x)]     # assign cumsum(x) to the column called "x.sum"
DT[, (target):=cumsum(x)]  # assign to the name contained in target's value

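As Justin mentioned, most of the time we assign to new or existing columns whose names we know up front. In other words, the column being assigned to is usually not held in a variable. We do that a lot, so it needs to be convenient. That said, data.table is flexible and also lets you define the target column names programmatically.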
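A minimal sketch of the difference between a bare name and a parenthesized name on the left of :=. The toy table and the variable target holding "x.sum" are assumptions made for illustration.

```r
library(data.table)

# Small illustrative table (values are made up for this sketch)
DT <- data.table(x = c(1, 2, 3, 4), y = rep(c("A", "B"), 2))
target <- "x.sum"   # hypothetical variable holding the desired column name

# Bare symbol on the LHS: taken literally, so the new column is named "target"
DT[, target := cumsum(x)]

# Parenthesized LHS: evaluated first, so the new column is named "x.sum"
DT[, (target) := cumsum(x)]

names(DT)
```

Both new columns hold the same values here; only the way the column name is resolved differs.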

I suppose a case could be made that it should instead be:

DT[, "x.sum":=cumsum(x)]   # assign cumsum(x) to the column called "x.sum"
DT[, x.sum:=cumsum(x)]     # assign to the name contained in x.sum's contents.

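However, since := is an assignment operator and j is evaluated within the scope of DT, to me it would be confusing if DT[, x.sum:=cumsum(x)] did not assign to the x.sum column.

Explicit parentheses, i.e. (target):=, imply some kind of evaluation, so that syntax is clearer. In my mind, anyway. Of course, you can also call paste0 etc. directly on the left-hand side of :=, without needing with=FALSE; e.g.,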

DT[, paste0("SUM.",colNames) := lapply(.SD, ...), by=...]
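With the elided pieces filled in from the asker's setup, the call might look like the sketch below. The use of cumsum, by = y, and .SDcols = colNames comes from the question; the values in v are made up here so the example does not depend on a random seed.

```r
library(data.table)

# Same shape as the asker's table, but with fixed values in v (an assumption)
DT <- data.table(x = rep(c(1, 2, 3), each = 4),
                 y = rep(c("A", "B"), 6),
                 v = 1:12)
colNames <- c("x", "v")

# The LHS is a call, so it is evaluated to c("SUM.x", "SUM.v");
# no with = FALSE is needed
DT[, paste0("SUM.", colNames) := lapply(.SD, cumsum),
   by = y, .SDcols = colNames]

names(DT)
```

Each element of the list returned by lapply is assigned to the matching name, giving one cumulative-sum column per entry in colNames, computed within each group of y.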

In short, I never use with when I use :=.
