Consider the following matrix:
m <- cbind(c("r1","r2","r3","r4","r1","r2","r3","r4"),c(3,2,5,2,5,2,6,4),c(4,3,5,3,7,4,6,7))
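For each row, I want to divide the row sum by its conditional row sum. That is, for every row named "r1", I want to divide that row's sum by the total of all rows named "r1". So for the first row this is "(3 + 4)/(3 + 4 + 5 + 7)".
The same applies to "r2", "r3" and "r4". For example, for the second row the calculation is "(2 + 3)/(2 + 3 + 2 + 4)".
How can I do this in R?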
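Answer 0 (score: 3)
After tidying the data (converting the matrix to a data.frame and the value columns to numeric), here is a base R solution: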
df <- data.frame(m, stringsAsFactors = FALSE)
df[-1] <- lapply(df[-1], as.numeric)
df$new <- df$X2 + df$X3
with(df, ave(new, X1, FUN = function(i)i / sum(i)))
#[1] 0.3684211 0.4545455 0.4545455 0.3125000 0.6315789 0.5454545 0.5454545 0.6875000
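As a variation on the same idea, the row sums can be computed with rowSums and passed straight to ave, without adding a helper column. This is only a sketch under the assumption that the matrix m from the question is used as-is; the names row_totals and ratio are illustrative and do not appear in the original answer.

```r
# Matrix from the question; cbind() coerces every column to character
m <- cbind(c("r1","r2","r3","r4","r1","r2","r3","r4"),
           c(3,2,5,2,5,2,6,4),
           c(4,3,5,3,7,4,6,7))

# Convert the value columns back to numeric; as.numeric() drops the
# dimensions, so matrix() restores them before taking the row sums
values <- matrix(as.numeric(m[, -1]), nrow = nrow(m))
row_totals <- rowSums(values)

# Divide each row sum by the total of the rows sharing the same name
ratio <- ave(row_totals, m[, 1], FUN = function(x) x / sum(x))
ratio
```

Since each group's shares add up to 1, sum(ratio) should equal the number of distinct row names (4 here), which gives a quick sanity check.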
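Answer 1 (score: 2)
First, create the data as a data.frame rather than a matrix, so that the numeric columns are not coerced to character. (If you have already created the matrix, you can also convert it to a data.frame using the first two lines of Sotos's answer.)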
df <- data.frame(row_id = c("r1","r2","r3","r4","r1","r2","r3","r4"),
v1 = c(3,2,5,2,5,2,6,4),
v2 = c(4,3,5,3,7,4,6,7))
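If the matrix m from the question already exists, the conversion mentioned above might look roughly like the following. This is a sketch, not part of the original answer: it assumes the columns of m are in the order shown in the question, and the name df_from_m is made up for illustration.

```r
# Matrix from the question; all three columns are stored as character
m <- cbind(c("r1","r2","r3","r4","r1","r2","r3","r4"),
           c(3,2,5,2,5,2,6,4),
           c(4,3,5,3,7,4,6,7))

# Build a data.frame with the same column names as above,
# converting the two value columns back to numeric
df_from_m <- data.frame(row_id = m[, 1],
                        v1 = as.numeric(m[, 2]),
                        v2 = as.numeric(m[, 3]),
                        stringsAsFactors = FALSE)
str(df_from_m)
```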
Now, if you convert the data.frame to a data.table with setDT, you can do this with data.table grouping (by = row_id sets the groups):
library(data.table)
setDT(df)
df[, ratio := (v1 + v2)/sum(v1 + v2), by = row_id]
df
# row_id v1 v2 ratio
# 1: r1 3 4 0.3684211
# 2: r2 2 3 0.4545455
# 3: r3 5 5 0.4545455
# 4: r4 2 3 0.3125000
# 5: r1 5 7 0.6315789
# 6: r2 2 4 0.5454545
# 7: r3 6 6 0.5454545
# 8: r4 4 7 0.6875000
Answer 2 (score: 1)
m <- cbind(c("r1","r2","r3","r4","r1","r2","r3","r4"),c(3,2,5,2,5,2,6,4),c(4,3,5,3,7,4,6,7))
require(dplyr)
m %>% as_tibble %>%
mutate(V4 = as.numeric(V2) + as.numeric(V3)) %>%
group_by(V1) %>%
mutate(conditional_sum = sum(V4)) %>%
ungroup %>%
mutate(calculation = V4/conditional_sum)
# A tibble: 8 x 6
# V1 V2 V3 V4 conditional_sum calculation
# <chr> <chr> <chr> <dbl> <dbl> <dbl>
# 1 r1 3 4 7 19 0.368
# 2 r2 2 3 5 11 0.455
# 3 r3 5 5 10 22 0.455
# 4 r4 2 3 5 16 0.312
# 5 r1 5 7 12 19 0.632
# 6 r2 2 4 6 11 0.545
# 7 r3 6 6 12 22 0.545
# 8 r4 4 7 11 16 0.688