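The help page for special-symbols in data.table says ".N can also be used in i." How do I do that?
For example, I would like the following code to keep only the rows that belong to a group with exactly one element.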
> library(data.table)
> set.seed(734)
> dt <- data.table(x = c(rep("a", 5), rep("b", 3), "c", "d", "e"),
y = runif(11))
> dt
x y
1: a 0.46431448
2: a 0.57148294
3: a 0.30197960
4: a 0.06394102
5: a 0.08793526
6: b 0.62994539
7: b 0.64693916
8: b 0.79671939
9: c 0.60865117
10: d 0.86025196
11: e 0.21562992
> dt[.N == 1, .(y), by = .(x)]
Empty data.table (0 rows) of 2 cols: x,y
I expected it to give the same result as:
> dt[, .(n = .N, y = y), by = .(x)][n == 1, .(x, y)]
x y
1: c 0.6086512
2: d 0.8602520
3: e 0.2156299
If you don't like the example above, how can I use .N in i of a data.table?
Answer 0 (score: 0)
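A logical expression based on .N is not applied per group when it is placed in i. Instead, get the row indices (.I) from an expression in j, extract the index column ($V1), and use it to subset the rows: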
dt[dt[, .I[.N == 1], by = .(x)]$V1]
# x y
#1: c 0.6086512
#2: d 0.8602520
#3: e 0.2156299
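The one-liner above can be unpacked into its three steps. This is only a sketch for illustration; the intermediate names idx_by_group, row_idx and singletons are made up here and are not part of data.table.

```r
library(data.table)

set.seed(734)
dt <- data.table(x = c(rep("a", 5), rep("b", 3), "c", "d", "e"),
                 y = runif(11))

# Step 1: per group, keep the original row numbers (.I) only when the
# group has exactly one row. The unnamed j result becomes column V1.
idx_by_group <- dt[, .I[.N == 1], by = .(x)]

# Step 2: pull the row numbers out as a plain integer vector.
row_idx <- idx_by_group$V1

# Step 3: use the row numbers in i to subset the original table.
singletons <- dt[row_idx]
```

Groups a and b contribute no rows in step 1, because .I[FALSE] is an empty vector, so only the row numbers of c, d and e remain.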
The same expression can also be used to subset .SD (this may be slow):
dt[, .SD[.N == 1], .(x)]
Regarding the usage, ?.N says:
.SD, .BY, .N, .I and .GRP are read-only symbols for use in j. .N can be used in i as well.
However, it does not mention in what context. If we use only an i expression,
dt[.N > 2] # works
or i together with j,
dt[.N > 2, .(x)]
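As far as I can tell, .N inside i is evaluated once against the whole table, before any grouping, so it is simply the total number of rows. The sketch below illustrates that reading on a table of the same shape (the y values and the result names are placeholders, not from the question); it would explain why dt[.N == 1, ...] comes back empty.

```r
library(data.table)

dt <- data.table(x = c(rep("a", 5), rep("b", 3), "c", "d", "e"),
                 y = 1:11)

# In i, .N is the total row count (11 here), so this selects the last row.
last_row <- dt[.N]

# 11 > 2 is a single TRUE, so every row is kept.
all_rows <- dt[.N > 2]

# 11 == 1 is a single FALSE, so no row is kept, whatever by says afterwards.
no_rows <- dt[.N == 1]
```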
To see what data.table does internally, pass verbose = TRUE:
dt[.N ==1, .SD, by = .(x), verbose = TRUE]
#i clause present and columns used in by detected, only these subset: x
#lapply optimization changed j from '.SD' to 'list(y)'
#Old mean optimization is on, left j unchanged.
#Making each group and running j (GForce FALSE) ...
# memcpy contiguous groups took 0.000s for 1 groups
# eval(j) took 0.000s for 1 calls
#0.046s elapsed (0.268s cpu)
#Empty data.table (0 rows and 2 cols): x,y
dt[dt[, .I[.N == 1], by = .(x), verbose = TRUE]$V1]
#Detected that j uses these columns: <none>
#Finding groups using forderv ... 0.032s elapsed (0.033s cpu)
#Finding group sizes from the positions (can be avoided to save RAM) ... 0.033s elapsed (0.194s cpu)
#lapply optimization is on, j unchanged as '.I[.N == 1]'
#GForce is on, left j unchanged
#Old mean optimization is on, left j unchanged.
#Making each group and running j (GForce FALSE) ... dogroups: growing from 0 to 2 rows
#dogroups: growing from 2 to 4 rows
#Wrote less rows (3) than allocated (4).
# memcpy contiguous groups took 0.000s for 5 groups
# eval(j) took 0.000s for 5 calls
#0.046s elapsed (0.273s cpu)