I am trying to compute a column with data.table.
The goal is to add a speedup column for runtime,
relative to the runs with 1 thread.
    setup   mode name threads runtime
 1:     A  short    K       1      10
 2:     A  short    K       1      11
 3:     A  short    K       1      10
 4:     A  short    K       2       4
 5:     A  short    K       2       5
 6:     A  short    K       2       8
 7:     B  short    K       1      11
 8:     B  short    K       1      12
 9:     B  short    K       1      10
10:     B  short    K       2       9
11:     B  short    K       2       6
12:     B  short    K       2       8
This is what I have so far:
valT[, speedup:=mean(runtime)/runtime, by=c("setup","threads","name","mode") ]
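Of course, the resulting speedup is not what I want, because each group is compared with its own mean rather than with the 1-thread mean. For example, with the data below the speedup for the first row should be about 1.03 (10.33/10), and for the fourth row about 2.58 (10.33/4). That is why I need to narrow down the selection used for the mean.
which() seems to be the answer, but I cannot get it to work: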
valT[, speedup:=mean(runtime)/runtime, which(threads==1), by=c("setup","threads","name","mode") ]
Error in `[.data.table`(valT, , runtime/mean(runtime), which(threads == :
Provide either 'by' or 'keyby' but not both
Data:
valT = data.table(structure(list(setup = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L), .Label = c("A", "B"), class = "factor"),
mode = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L), .Label = " short", class = "factor"), name = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = " K", class = "factor"),
threads = c(1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L
), runtime = c(10, 11, 10, 4, 5, 8, 11, 12, 10, 9, 6, 8)), .Names = c("setup",
"mode", "name", "threads", "runtime"), class = "data.frame", row.names = c(NA,
-12L)))
Answer 0 (score: 3)
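This works. Drop threads from the grouping so that each group contains the rows for every thread count, then take the mean over only the 1-thread rows inside j: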
valT[, speedup := mean(runtime[threads == 1]) / runtime,
by = c("setup","name","mode")]
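To check the result, here is a minimal self-contained sketch. It assumes the data.table package is installed, and it rebuilds the posted data in a shorter form than the dput output (same values; the compact constructor is my own shorthand, not the asker's code). With the data as posted, the 1-thread baseline for setup A is 31/3, about 10.33, so row 1 should come out near 1.033 and row 4 near 2.583.

```r
library(data.table)

# Rebuild the posted data compactly. The factor levels " short" and " K"
# keep the leading space found in the original dput output.
valT <- data.table(
  setup   = factor(rep(c("A", "B"), each = 6)),
  mode    = factor(" short"),
  name    = factor(" K"),
  threads = rep(rep(1:2, each = 3), times = 2),
  runtime = c(10, 11, 10, 4, 5, 8, 11, 12, 10, 9, 6, 8)
)

# Group WITHOUT threads, so every group holds both the 1-thread and the
# 2-thread rows. The baseline is the mean runtime of the 1-thread rows in
# that group, and each row's runtime is compared against it.
valT[, speedup := mean(runtime[threads == 1]) / runtime,
     by = c("setup", "name", "mode")]

print(valT)
```

As for the error in the question: in `[.data.table` the formal arguments after j are by and then keyby, so the unnamed which(threads == 1) is most likely matched to keyby while by is also given by name, which would explain the "either 'by' or 'keyby' but not both" message.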