I'm not sure how to put the question into words, but how can I create an index column for a data.table that, within each group, increments whenever a different value appears?
Here is the MWE:
library(data.table)
in.data <- data.table(fruits=c(rep("banana", 4), rep("pear", 5)),vendor=c("a", "b", "b", "c", "d", "d", "e", "f", "f"))
Here is the result the R-code should generate
in.data[, wanted.column:=c(1,2,2,3,1,1,2,3,3)]
#    fruits vendor wanted.column
# 1: banana      a             1
# 2: banana      b             2
# 3: banana      b             2
# 4: banana      c             3
# 5:   pear      d             1
# 6:   pear      d             1
# 7:   pear      e             2
# 8:   pear      f             3
# 9:   pear      f             3
So it labels each vendor 1, 2, 3, ... within each fruit. There is probably a very simple solution, but I'm stuck.
Answer 0 (score: 9)
I have a few ideas. You can use a nested group counter:
in.data[, w := setDT(list(v = vendor))[, g := .GRP, by=v]$g, by=fruits]
Alternatively, make a run ID, which depends on sorted data (thanks @eddi) and seems wasteful:
in.data[, w := rleid(vendor), by=fruits]
The base-R approach would probably be:
in.data[, w := match(vendor, unique(vendor)), by=fruits]
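For readers who want to check the idea without data.table, here is a minimal base-R sketch of the same first-appearance numbering. It rebuilds the question's two columns as plain vectors and uses ave(), which is my choice for illustration and not something the answer above proposes; the name first_seen is made up for this example.

```r
# The question's data as plain vectors (no data.table needed).
fruits <- c(rep("banana", 4), rep("pear", 5))
vendor <- c("a", "b", "b", "c", "d", "d", "e", "f", "f")

# ave() hands FUN the row positions of one fruit at a time and writes the
# results back in the original row order. Within a fruit, each vendor is
# numbered by the position of its first appearance.
first_seen <- ave(seq_along(vendor), fruits,
                  FUN = function(i) match(vendor[i], unique(vendor[i])))

print(first_seen)
```

On the question's data this prints 1 2 2 3 1 1 2 3 3, which matches wanted.column.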
Answer 1 (score: 9)
Another approach might be two steps: first number every (fruits, vendor) group with .GRP, then subtract each fruit's first value so that the count restarts at 1 within each fruit.
The way I would comment this in production code might be:
DT = data.table(fruits=c(rep("banana", 4), rep("pear", 5)),vendor=c("a", "b", "b", "c", "d", "d", "e", "f", "f"))
DT
#    fruits vendor
# 1: banana      a
# 2: banana      b
# 3: banana      b
# 4: banana      c
# 5:   pear      d
# 6:   pear      d
# 7:   pear      e
# 8:   pear      f
# 9:   pear      f
DT[, wanted:=.GRP, by="fruits,vendor"] # step 1
DT
#    fruits vendor wanted
# 1: banana      a      1
# 2: banana      b      2
# 3: banana      b      2
# 4: banana      c      3
# 5:   pear      d      4
# 6:   pear      d      4
# 7:   pear      e      5
# 8:   pear      f      6
# 9:   pear      f      6
DT[, wanted:=wanted-wanted[1]+1L, by="fruits"] # step 2 (adjust)
DT
#    fruits vendor wanted
# 1: banana      a      1
# 2: banana      b      2
# 3: banana      b      2
# 4: banana      c      3
# 5:   pear      d      1
# 6:   pear      d      1
# 7:   pear      e      2
# 8:   pear      f      3
# 9:   pear      f      3
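The adjustment in step 2 is plain arithmetic: within each fruit, subtract the first group number and add 1. Here is a small base-R sketch of just that step, assuming the step-1 numbers shown above and using ave() in place of data.table's by; the vector names are only for illustration.

```r
# Group numbers as they appear after step 1 in the output above.
fruits <- c(rep("banana", 4), rep("pear", 5))
wanted <- c(1L, 2L, 2L, 3L, 4L, 4L, 5L, 6L, 6L)

# Step 2: per fruit, shift the numbers so the first row becomes 1.
adjusted <- ave(wanted, fruits, FUN = function(x) x - x[1] + 1L)

print(adjusted)
```

This prints 1 2 2 3 1 1 2 3 3. The trick relies on the step-1 numbers being consecutive within each fruit, which holds here because each fruit's rows sit together. With interleaved fruits I would expect gaps in the result, so treat that as an assumption to check on your own data.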
Answer 2 (score: 4)
If you want the index to be the same for every occurrence of a given vendor within a fruit, here is another option:
in.data[, wanted := as.integer(factor(vendor, levels = unique(vendor))), by = fruits]
Otherwise, if you want it to tick up every time the vendor changes, then, of the answers given so far, rleid is the only one that works.
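To make the distinction concrete, here is a small base-R sketch on a made-up vendor sequence for a single fruit, in which vendor "a" returns after "b". The data and variable names are hypothetical, and rle() stands in for data.table's rleid so that the snippet runs without extra packages.

```r
# Hypothetical vendor sequence within one fruit; vendor "a" comes back.
vendor <- c("a", "b", "a", "a", "c")

# First-appearance index: the same vendor always gets the same number.
same_vendor_same_index <- as.integer(factor(vendor, levels = unique(vendor)))

# Run index (the idea behind rleid): ticks up at every change of vendor.
runs <- rle(vendor)
ticks_on_change <- rep(seq_along(runs$lengths), runs$lengths)

print(same_vendor_same_index)
print(ticks_on_change)
```

The first vector prints as 1 2 1 1 3 and the second as 1 2 3 3 4. On the question's data, where each vendor's rows are adjacent, both give the same result, so the choice only matters when a vendor can reappear.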