Question

I have a data.table:

> (a <- data.table(id=c(1,1,1,2,2,3),
                   attribute=c("a","b","c","a","b","c"),
                   importance=1:6,
                   key=c("id","importance")))
   id attribute importance
1:  1         a          1
2:  1         b          2
3:  1         c          3
4:  2         a          4
5:  2         b          5
6:  3         c          6

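I would like to:

- 1 - sort it by the second key in decreasing order (i.e., the most important attributes should come first)

- 2 - select the top 2 (or 10) attributes for each id, i.e.: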

   id attribute importance
3:  1         c          3
2:  1         b          2
5:  2         b          5
4:  2         a          4
6:  3         c          6

- 3 - pivot the above:

id  attribute.1 importance.1 attribute.2 importance.2
 1            c            3           b            2
 2            b            5           a            4
 3            c            6          NA           NA

It seems the last operation can be done with something like:

a[,{ 
  tmp <- .SD[.N:1]; 
  list(a1 = tmp$attribute[1], 
       i1 = tmp$importance[1])
}, by=id]

Is this the right way to do it?

How do I accomplish the first two tasks?

Answer 1

I would do the first two tasks like this:

a[a[, .I[.N:(.N-1)], by=list(id)]$V1]

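The inner a[, .I[.N:(.N-1)], by=list(id)] gives you the row indices, in the order you need, for every unique group in id. Then a is subset with the V1 column (which holds those indices in the required order).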
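To make the two steps easier to follow, here is a small sketch that runs them separately on the table from the question. The name idx is introduced only for illustration and is not part of the original answer.

```r
library(data.table)

# Table from the question, keyed (and therefore sorted) by id, importance.
a <- data.table(id = c(1, 1, 1, 2, 2, 3),
                attribute = c("a", "b", "c", "a", "b", "c"),
                importance = 1:6,
                key = c("id", "importance"))

# Step 1: for each id, take the row numbers (.I) of the last two rows,
# listed from the last row backwards. The result has columns id and V1.
idx <- a[, .I[.N:(.N - 1)], by = list(id)]
print(idx)

# Step 2: use the collected row numbers to subset the original table.
print(a[idx$V1])
```

Because the table is keyed by id and importance, the last rows of each group are the most important ones, which is why reading each group from the bottom gives the descending order.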

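You will have to guard against zero or negative indices here (they arise when a group has fewer rows than you ask for), perhaps like this: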

a[a[, .I[seq.int(.N, max(.N-1L, 1L))], by=list(id)]$V1]
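The answer stops at the first two tasks. As a rough sketch only, the same idea might be generalized to the top n rows and then pivoted for the third task. The names n, top, pos and wide are hypothetical, and using dcast with a per-group position column is one possible approach, not something the answer itself proposes. Note that dcast names the resulting columns with an underscore (attribute_1) rather than the dot used in the question.

```r
library(data.table)

a <- data.table(id = c(1, 1, 1, 2, 2, 3),
                attribute = c("a", "b", "c", "a", "b", "c"),
                importance = 1:6,
                key = c("id", "importance"))

n <- 2L  # assumed: how many top rows to keep per id

# Same guard as above, written for a general n: never index below row 1.
top <- a[a[, .I[seq.int(.N, max(.N - n + 1L, 1L))], by = list(id)]$V1]

# Position of each row within its id (1 = most important).
top[, pos := seq_len(.N), by = id]

# Reshape to one row per id; missing positions are filled with NA.
wide <- dcast(top, id ~ pos, value.var = c("attribute", "importance"))
print(wide)
```

Setting n to 10 should work the same way, with any id that has fewer than 10 rows simply contributing the rows it has.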
