Divide columns by another column with dplyr in R

Time: 2019-12-14 11:19:37

Tags: r dplyr divide

I am doing this on a table:

tmp %>%
mutate(sum_onCPA = rowSums(select(., setdiff(colnames(.),NON_CPA_VARIABLES)))) %>%
mutate_at(vars(CPA_A01: CPA_U), (./ sum_onCPA))

I want to divide each of the columns from CPA_A01 to CPA_U (65 columns) by the sum column (sum_onCPA), but I get an error:

Error in is_fun_list(.funs) : object 'sum_onCPA' not found

Any ideas?

1 Answer:

Answer 0: (score: 1)

You can refer to the column as .$sum_onCPA. With some example data:

library(dplyr)

set.seed(100)
tmp = data.frame(matrix(runif(25),ncol=5))
NON_CPA_VARIABLES = c("X1","X5")

tmp = tmp %>% 
mutate(sum_onCPA = rowSums(select(., setdiff(colnames(.),NON_CPA_VARIABLES)))) 

Then you can do:

tmp %>% mutate_at(vars(X2:X4),function(i)i/.$sum_onCPA)

Credit to @ronakshah, who pointed out a neater version:

tmp %>% mutate_at(vars(X2:X4),~.x/sum_onCPA)

          X1        X2        X3        X4        X5 sum_onCPA
1 0.30776611 0.2721193 0.3515583 0.3763224 0.5358112  1.777789
2 0.25767250 0.4277649 0.4644980 0.1077371 0.7108038  1.899180
3 0.55232243 0.3673089 0.2780738 0.3546173 0.5383487  1.008199
4 0.05638315 0.4189724 0.3054667 0.2755609 0.7489722  1.304522
5 0.46854928 0.1048991 0.4698105 0.4252905 0.4201015  1.623104
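
A likely reason for the original error is that (./ sum_onCPA) is an ordinary expression rather than a function, so R tries to evaluate it immediately in the calling environment, where no object named sum_onCPA exists. Wrapping it in a function or a ~ lambda defers evaluation until dplyr can see the column. As a minimal sketch, assuming dplyr 1.0.0 or later (where mutate_at is superseded by across()), the same division could be written as follows; the name res is only for illustration:

```r
library(dplyr)

set.seed(100)
tmp <- data.frame(matrix(runif(25), ncol = 5))
NON_CPA_VARIABLES <- c("X1", "X5")

# Row-wise total over the columns that are not excluded (X2, X3, X4 here)
tmp <- tmp %>%
  mutate(sum_onCPA = rowSums(select(., all_of(setdiff(colnames(.), NON_CPA_VARIABLES)))))

# The lambda is evaluated inside mutate(), so sum_onCPA resolves to the column
res <- tmp %>%
  mutate(across(X2:X4, ~ .x / sum_onCPA))

# Row sums of the divided columns, as a quick sanity check
rowSums(res[, c("X2", "X3", "X4")])
```

Because sum_onCPA is the sum of exactly the columns being divided, each row of X2 to X4 in res should add up to 1.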

We can use base R's sweep to check that the above is correct:

tmp[,c("X2","X3","X4")] = sweep(tmp[,c("X2","X3","X4")],1,tmp$sum_onCPA,"/")
tmp
              X1        X2        X3        X4        X5 sum_onCPA
1 0.30776611 0.2721193 0.3515583 0.3763224 0.5358112  1.777789
2 0.25767250 0.4277649 0.4644980 0.1077371 0.7108038  1.899180
3 0.55232243 0.3673089 0.2780738 0.3546173 0.5383487  1.008199
4 0.05638315 0.4189724 0.3054667 0.2755609 0.7489722  1.304522
5 0.46854928 0.1048991 0.4698105 0.4252905 0.4201015  1.623104
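
Instead of comparing the two printouts by eye, the check can also be done programmatically. This is only a sketch under the same example data; the names cols, via_dplyr and via_sweep are made up for illustration:

```r
library(dplyr)

set.seed(100)
tmp <- data.frame(matrix(runif(25), ncol = 5))
cols <- c("X2", "X3", "X4")
tmp$sum_onCPA <- rowSums(tmp[, cols])

# dplyr version from the answer
via_dplyr <- tmp %>% mutate_at(vars(X2:X4), ~ .x / sum_onCPA)

# Base R version: divide each row (MARGIN = 1) by its sum_onCPA value
via_sweep <- tmp
via_sweep[, cols] <- sweep(tmp[, cols], 1, tmp$sum_onCPA, "/")

# Compare the two results up to numerical tolerance
isTRUE(all.equal(via_dplyr, via_sweep))
```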