I am using the data.table package with the following data frame, generated with reproduce(df):
outRes vars ts_length BIAS
1 1t sd 0 -0.046
2 1t sd 3 -0.105
3 1t sd 6 -0.249
4 1t sd 1 -0.024
5 1t sd 1 1.246
6 1t sd 6 0.885
7 1t sd 1 0.280
46 day sd 0 -0.061
47 day sd 3 -0.119
48 day sd 6 -0.256
49 day sd 1 -0.039
50 day sd 1 1.239
51 day sd 6 0.888
52 day sd 1 0.253
268 month LE 1 -0.085
269 month LE 3 -0.147
270 month LE 6 -0.305
df <- structure(list(outRes = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,3L, 3L, 3L),
.Label = c("1t", "day", "month"), class = "factor"),
vars = structure(c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L), .Label = c("H","LE", "sd", "sm2", "Ts2"), class = "factor"),
ts_length = structure(c(1L, 3L, 4L, 2L, 2L, 4L, 2L, 2L, 3L,4L), .Label = c("0", "1", "3", "6"), class = "factor"),
BIAS = c(-0.046,-0.105, -0.249, -0.024, 1.246, 0.885, 0.28, -0.085, -0.147,-0.305)),
.Names = c("outRes", "vars", "ts_length", "BIAS"), class = "data.frame",
row.names = c(1L, 2L, 3L, 4L, 5L, 6L,7L, 268L, 269L, 270L))
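First, I need to find the lowest absolute value of df$BIAS within each group of df$vars and df$outRes. In the example above, for outRes = 1t and vars = sd, the smallest BIAS is -0.024, so I need to print ts_length = "1"; for outRes = day, I need ts_length = 0 for the smallest BIAS = -0.061. Using the data.table package, I can get the value of BIAS with: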
dt = as.data.table(df)
dt[,min(abs(BIAS)),by="vars,outRes"]
which gives me the output:
vars outRes V1
1: sd 1t 0.024
2: sm2 1t 2.615
3: Ts2 1t 0.000
4: H 1t 0.735
5: LE 1t 0.018
6: sd day 0.039
7: sm2 day 2.661 etc...
What I want is the value of the df$ts_length column that corresponds to each V1. I tried:
setkey(dt,outRes,vars,BIAS)
dt[J(dt[,min(abs(BIAS)),by="outRes,vars"])]
[V1== BIAS,list(ID,ts_length,BIAS,outRes,vars)]
but 2 of the 5 levels of $vars disappeared, giving these results:
ts_length BIAS outRes vars
1: 3 0.018 1t LE
2: 0 2.615 1t sm2
3: 6 0.000 1t Ts2
4: 0 0.005 day LE
5: 0 2.661 day sm2
I am new to data.table and admit that I don't really understand the code itself, so I also tried:
setkey(dt,vars,outRes,BIAS)
dt[J(dt[,min(abs(BIAS)),by="vars,outRes"])]
[V1== BIAS,list(ts_length,BIAS,vars,outRes)]
but I again get only 3 levels. What is going wrong? How can I get all 5 levels of the factor vars rather than just 3?
Answer 0 (score: 1)
Thanks for the reproducible example. Try the following:
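setkey(dt, vars, outRes)
# Cross-join every vars/outRes combination, then evaluate j once per combination.
# by = .EACHI is needed in data.table >= 1.9.4; older versions did this implicitly.
dt[ CJ(levels(vars), levels(outRes))
  , .SD[abs(BIAS) == min(abs(BIAS))]
  , by = .EACHI
  , .SDcols = c("BIAS", "ts_length")
  ]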
vars outRes BIAS ts_length
1: H 1t NA NA
2: H day NA NA
3: H month NA NA
4: LE 1t NA NA
5: LE day NA NA
6: LE month -0.085 1
7: sd 1t -0.024 1
8: sd day NA NA
9: sd month NA NA
10: sm2 1t NA NA
11: sm2 day NA NA
12: sm2 month NA NA
13: Ts2 1t NA NA
14: Ts2 day NA NA
15: Ts2 month NA NA
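
A likely reason the attempts in the question lose groups: V1 holds abs(BIAS), while the join key and the V1 == BIAS filter use the signed BIAS, so any group whose closest-to-zero BIAS is negative (for example sd, with -0.024) never matches. If the empty vars/outRes combinations are not needed, a grouped subset avoids the join entirely. The following is only a minimal sketch on a few rows copied from the question's sample; the name res is made up for illustration.

```r
library(data.table)

# A few rows from the question's sample data (character columns for brevity)
dt <- data.table(
  outRes    = c("1t", "1t", "1t", "month", "month", "month"),
  vars      = c("sd", "sd", "sd", "LE", "LE", "LE"),
  ts_length = c("0", "3", "1", "1", "3", "6"),
  BIAS      = c(-0.046, -0.105, -0.024, -0.085, -0.147, -0.305)
)

# For each (vars, outRes) group, keep the row whose BIAS is closest to zero.
# which.min() returns the first minimum, so a tie yields a single row.
res <- dt[, .SD[which.min(abs(BIAS))], by = .(vars, outRes)]
print(res)
```

Unlike the CJ() approach, this returns only the combinations that actually occur in the data, and it keeps the original sign of BIAS next to the matching ts_length.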