This question is related to: Add a column to a data frame that index the number of occurrences in a group. I have the following data.table, sorted by its first two columns.
library(data.table)
# dput() output; the .internal.selfref pointer is dropped because it cannot be parsed back in
ddt = structure(list(Unit = structure(c(1L, 1L, 2L, 2L, 3L, 3L), .Label = c("A",
"A1", "B"), class = "factor"), Anything = c(3.4, 6.9, 1.1, 2.2,
2, 3), index = c(0, 0, 0, 0, 0, 0)), .Names = c("Unit", "Anything",
"index"), row.names = c(NA, -6L), class = c("data.table", "data.frame"
), sorted = c("Unit", "Anything"))
ddt
   Unit Anything index
1:    A      3.4     0
2:    A      6.9     0
3:   A1      1.1     0
4:   A1      2.2     0
5:    B      2.0     0
6:    B      3.0     0
The index column should be filled with 1, 2, 3, ... within each Unit. With a data.frame, I can do this as follows:
for(U in unique(ddt$Unit)){
  ddt[ddt$Unit==U,]$index = 1:length(ddt[ddt$Unit==U,]$Unit)
}
ddt
  Unit Anything index
1    A      3.4     1
3    A      6.9     2
4   A1      1.1     1
2   A1      2.2     2
5    B      2.0     1
6    B      3.0     2
But how can I do this with data.table commands? Thanks for your help.
Answer 0: (score: 2)
Try
ddt[, indx:=1:.N, by=Unit]
#   Unit Anything indx
#1:    A      3.4    1
#2:    A      6.9    2
#3:   A1      1.1    1
#4:   A1      2.2    2
#5:    B      2.0    1
#6:    B      3.0    2
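For reference, here is a self-contained sketch of the same idea on a small hand-built table. The name dt and the extra column indx2 are illustrative and not from the thread. It uses seq_len(.N), which is equivalent to 1:.N here, and compares the result with data.table's rowid(), which should give the same within-group counter in recent versions of the package.

```r
library(data.table)

# Hypothetical stand-in for ddt, without the pre-existing index column
dt <- data.table(Unit = c("A", "A", "A1", "A1", "B", "B"),
                 Anything = c(3.4, 6.9, 1.1, 2.2, 2, 3))

# .N is the number of rows in the current group, so this counts 1..n per Unit
dt[, indx := seq_len(.N), by = Unit]

# rowid() numbers the rows within each distinct value of its argument
dt[, indx2 := rowid(Unit)]
```

Both columns should hold the same integer counter, so either form can be used depending on taste.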
Answer 1: (score: 1)
Try this:
ddt[, index := as.numeric(seq_len(.N)), by="Unit"]
ddt
   Unit Anything index
1:    A      3.4     1
2:    A      6.9     2
3:   A1      1.1     1
4:   A1      2.2     2
5:    B      2.0     1
6:    B      3.0     2
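Answer 2: (score: 1)
One issue is that a := assignment by group will not change the class of an existing column (index is of type double, whereas ideally you want integer). I suggest removing index and recreating it with :=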
ddt$index = NULL
ddt[,index:= 1:nrow(.SD), by=Unit]
> ddt
   Unit Anything index
1:    A      3.4     1
2:    A      6.9     2
3:   A1      1.1     1
4:   A1      2.2     2
5:    B      2.0     1
6:    B      3.0     2
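A quick way to confirm the type difference, sketched on a hand-built copy of the table. The names dt, before and after are illustrative, and the column is dropped with index := NULL rather than with $<- as above, which is an assumption about style rather than a required change.

```r
library(data.table)

# Hypothetical copy of the question's table; index starts out as double
dt <- data.table(Unit = c("A", "A", "A1", "A1", "B", "B"),
                 Anything = c(3.4, 6.9, 1.1, 2.2, 2, 3),
                 index = 0)

before <- class(dt$index)              # class of the original double column

dt[, index := NULL]                    # drop the column by reference
dt[, index := seq_len(.N), by = Unit]  # recreate it; seq_len() returns integers

after <- class(dt$index)               # class of the recreated column
```

Comparing before and after shows whether the recreated column is stored as integer rather than double.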