Calculating a percentage using two variables

Time: 2016-10-21 16:53:09

Tags: r data.table

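I am trying to calculate the percentage of one value relative to another within a column, taking both condition (ID) and day into account. The end result should be one row per condition per day, holding the percentage value. My progress so far is shown below, but I am stuck on the last step. Any help is much appreciated.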

Loading the data:

library(data.table)
ID <- c(rep("A", 5), rep("B", 6), rep("C", 4))
Day <- c(1,1,1,2,2,1,1,1,2,2,2,1,1,1,2)
Results1 <- c("x","z","z","z","x","z","x","z","z","z","x","x","z","z","x")
Results2 <- c(1,0,0,1,1,1,2,1,1,1,1,1,1,0,1)

x <- data.table(ID, Day, Results1)
x

Calculating the overall percentage:

sum(x$Results1== "x") / (sum(x$Results1 == "x") + sum(x$Results1 == "z")) * 100

Attempt at calculating it per day and per condition:

a <- as.data.table(x)[, lapply(.SD, sum(x$Results1== "x") / (sum(x$Results1 == "x") + sum(x$Results1 == "z")) * 100), by .(x$ID, x$Day)]

1 answer:

Answer 0: (score: 4)

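Do not use $ inside the data.table call: x$Results1 always returns the column of the entire data.table, not just the rows of the group currently being processed. Refer to the column by its bare name instead: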

x[, .( (sum(Results1 == "x") / .N) * 100), by = .(ID, Day)]
   ID Day        V1
1:  A   1  33.33333
2:  A   2  50.00000
3:  B   1  33.33333
4:  B   2  33.33333
5:  C   1  33.33333
6:  C   2 100.00000
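
To make the difference visible, here is a minimal sketch on a made-up table. The names `dt`, `grp`, `res`, `wrong` and `right` are illustrative only and do not come from the question. With `by`, a bare column name is evaluated on the rows of each group, whereas `dt$res` returns the full column every time.

```r
library(data.table)

# Hypothetical toy data: two groups with different shares of "x"
dt <- data.table(grp = c("a", "a", "b", "b"),
                 res = c("x", "x", "x", "z"))

# dt$res ignores the grouping: every group sees all four rows
wrong <- dt[, .(pct = sum(dt$res == "x") / length(dt$res) * 100), by = grp]

# Bare column name: evaluated on the rows of the current group only
right <- dt[, .(pct = sum(res == "x") / .N * 100), by = grp]
```

In `wrong` both groups get the same overall value, while `right` holds one percentage per group. Naming the result (`pct = ...`) also avoids the default column name V1.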

If you have multiple result columns:

x[, lapply(.SD, function(col) sum(col == "x") / .N * 100), by = .(ID, Day)]
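
If the table also holds columns that should not be summarised, `.SDcols` can restrict `.SD` to the result columns. The following is a sketch under assumptions: the table `dt` and the column names `Res_a` and `Res_b` are made up for illustration and are not part of the question's data.

```r
library(data.table)

# Hypothetical data with two result columns (Res_a and Res_b are made-up names)
dt <- data.table(ID    = c("A", "A", "B", "B"),
                 Day   = c(1, 1, 1, 1),
                 Res_a = c("x", "z", "x", "x"),
                 Res_b = c("z", "z", "x", "z"))

result_cols <- c("Res_a", "Res_b")

# mean() of a logical vector is the share of TRUE values,
# so this equals sum(col == "x") / .N * 100 for each column
pct <- dt[, lapply(.SD, function(col) mean(col == "x") * 100),
          by = .(ID, Day), .SDcols = result_cols]
```

The result has one row per ID and Day, with one percentage column per entry in `result_cols`.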