R data.table条件选择

时间:2016-03-31 03:24:03

标签: r data.table

我有一个data.table

currentTime

set.seed(1) DT <- data.table(tag = rep(LETTERS[1:4],each = 2, times = 3), year = rep(1:3, each = 8), month = rep(c(1:2), 12), value = runif(24, 1, 10)) DT tag year month value 1: A 1 1 3.389578 2: A 1 2 4.349115 3: B 1 1 6.155680 4: B 1 2 9.173870 5: C 1 1 2.815137 6: C 1 2 9.085507 7: D 1 1 9.502077 8: D 1 2 6.947180 9: A 2 1 6.662026 10: A 2 2 1.556076 11: B 2 1 2.853771 12: B 2 2 2.589011 13: C 2 1 7.183206 14: C 2 2 4.456933 15: D 2 1 7.928573 16: D 2 2 5.479293 17: A 3 1 7.458567 18: A 3 2 9.927155 19: B 3 1 4.420317 20: B 3 2 7.997007 21: C 3 1 9.412347 22: C 3 2 2.909283 23: D 3 1 6.865064 24: D 3 2 2.129996 DTyear对此month进行排序:

-value

将产生:

setorder(DT, year, month, -value)

我希望结果如下:

    tag year month    value
 1:   D    1     1 9.502077
 2:   B    1     1 6.155680
 3:   A    1     1 3.389578
 4:   C    1     1 2.815137
 5:   B    1     2 9.173870
 6:   C    1     2 9.085507
 7:   D    1     2 6.947180
 8:   A    1     2 4.349115
 9:   D    2     1 7.928573
10:   C    2     1 7.183206
11:   A    2     1 6.662026
12:   B    2     1 2.853771
13:   D    2     2 5.479293
14:   C    2     2 4.456933
15:   B    2     2 2.589011
16:   A    2     2 1.556076
17:   C    3     1 9.412347
18:   A    3     1 7.458567
19:   D    3     1 6.865064
20:   B    3     1 4.420317
21:   A    3     2 9.927155
22:   B    3     2 7.997007
23:   C    3     2 2.909283
24:   D    3     2 2.129996

结果 tag year month value 1: D 1 1 9.502077 2: B 1 1 6.155680 3: B 1 2 9.173870 4: D 1 2 6.947180 5: D 2 1 7.928573 6: C 2 1 7.183206 7: D 2 2 5.479293 8: C 2 2 4.456933 9: C 3 1 9.412347 10: A 3 1 7.458567 11: A 3 2 9.927155 12: C 3 2 2.909283 将具有以下属性:在每年内只保留DT中值较大的两个tags。例如,month 1,值较大的两个year 1tagsD,因此保留BD为{ {1}}。 B保留year 1year 2。在每个D,我需要重新选择年份的行。

1 个答案:

答案 0 :(得分:3)

我们按'年'分组,获取'月'1的'标签'中的前两个元素(因为它已经被订购),使用%in%创建逻辑索引并对行进行子集化。

DT[, .SD[tag %in% head(tag[month ==1],2)], .(year)]
#    year tag month    value
# 1:    1   D     1 9.502077
# 2:    1   B     1 6.155680
# 3:    1   B     2 9.173870
# 4:    1   D     2 6.947180
# 5:    2   D     1 7.928573
# 6:    2   C     1 7.183206
# 7:    2   D     2 5.479293
# 8:    2   C     2 4.456933
# 9:    3   C     1 9.412347
#10:    3   A     1 7.458567
#11:    3   A     2 9.927155
#12:    3   C     2 2.909283