Convert a table of totals into a table of percentages based on the Total row

Time: 2018-08-20 22:13:38

Tags: r percentage

I have data like this:

a <- data.frame("Color" = c("Blue", "Red", "Green", "Total"),
                "N_Likes" = c(5, 4, 1, 10),
                "N_Dislikes" = c(2, 4, 2, 8))

It looks like this:

  Color N_Likes N_Dislikes
1  Blue       5          2
2   Red       4          4
3 Green       1          2
4 Total      10          8

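These values are counts, with the column totals in the last row, and I want to convert them to percentages.

I want to end up with something like this: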

  Color N_Likes N_Dislikes
1  Blue     50%        25%
2   Red     40%        50%
3 Green     10%        25%
4 Total    100%       100%

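Each value in the table is a percentage of the corresponding column total.
I know I could do this by hand, but is there an easy way to make this change?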

Update

Also, if there are NA values, I want to ignore them and leave them as NA:

  Color N_Likes N_Dislikes  N_Neutral
1  Blue       5          2          1
2   Red       4          4         NA
3 Green       1          2          2
4 Total      10          8          3

This would result in:

  Color N_Likes N_Dislikes N_Neutral
1  Blue     50%        25%    33.33%
2   Red     40%        50%        NA
3 Green     10%        25%    66.66%
4 Total    100%       100%      100%

3 answers:

Answer 0 (score: 3)

You can use lapply to loop over the numeric columns:
col_idx <- sapply(a, is.numeric) # find positions of numeric columns
a[, col_idx] <- lapply(a[, col_idx], function(x) {
  ifelse(is.na(x), NA, paste0(x / max(x, na.rm = TRUE) * 100, "%"))
})
a
#  Color N_Likes N_Dislikes
#1  Blue     50%        25%
#2   Red     40%        50%
#3 Green     10%        25%
#4 Total    100%       100%
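
As a minimal sketch of how the same idea behaves on the updated data from the question (the one with an NA in N_Neutral): the helper name to_pct is hypothetical, and the rounding to two decimals is an addition, since without it 1/3 would print as a long decimal. Like the code above, it assumes the Total row holds the largest value in each column. Note that rounding gives 66.67% rather than the 66.66% shown in the question.

```r
# Updated data from the question, including an NA in N_Neutral
a <- data.frame(
  Color      = c("Blue", "Red", "Green", "Total"),
  N_Likes    = c(5, 4, 1, 10),
  N_Dislikes = c(2, 4, 2, 8),
  N_Neutral  = c(1, NA, 2, 3)
)

# Hypothetical helper: percentage of the column maximum, rounded to
# 2 decimals; NA inputs stay as real missing values
to_pct <- function(x) {
  pct <- round(x / max(x, na.rm = TRUE) * 100, 2)
  ifelse(is.na(x), NA_character_, paste0(pct, "%"))
}

col_idx <- vapply(a, is.numeric, logical(1))  # numeric columns only
a[col_idx] <- lapply(a[col_idx], to_pct)
a
```

The converted columns are character vectors, so this is best done as a final formatting step, after any numeric work is finished.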

Answer 1 (score: 3)

Another solution, using dplyr:

a <- data.frame("Color" = c("Blue", "Red", "Green", "Total"),
                "N_Likes" = c(5, 4, 1, 10),
                "N_Dislikes" = c(2, 4, 2, 8))

library(dplyr)

a %>% mutate_at(vars(matches("N")), ~paste0(round(100*./last(.), 2), "%"))

#     Color N_Likes N_Dislikes
#   1  Blue     50%        25%
#   2   Red     40%        50%
#   3 Green     10%        25%
#   4 Total    100%       100%

I used last(.) on the assumption that Total is always the last row of the data frame.
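
If that assumption might not hold, one option is to find the Total row by its label instead of its position. The following is only a sketch in base R, assuming the label column is called Color and contains exactly one row labelled "Total"; the reordered data frame and the names total_row and num_cols are made up for illustration.

```r
# Same numbers as in the question, but with the Total row placed first
# (a hypothetical ordering, to show that position no longer matters)
a <- data.frame(
  Color      = c("Total", "Blue", "Red", "Green"),
  N_Likes    = c(10, 5, 4, 1),
  N_Dislikes = c(8, 2, 4, 2)
)

# Locate the Total row by label and check that it is unique
total_row <- which(a$Color == "Total")
stopifnot(length(total_row) == 1)

# Divide every numeric column by its own value in the Total row
num_cols <- names(a)[vapply(a, is.numeric, logical(1))]
a[num_cols] <- lapply(a[num_cols], function(x) {
  paste0(round(100 * x / x[total_row], 2), "%")
})
a
```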

For NA values, you can use:

a %>% mutate_at(vars(matches("N")), 
                ~ifelse(is.na(.), "NA", paste0(round(100*./last(.), 2), "%")))

if you want the string "NA" (a character value), or you can use:

a %>% mutate_at(vars(matches("N")), 
                ~ifelse(is.na(.), NA, paste0(round(100*./last(.), 2), "%")))
if you want a proper NA (a missing value, not the string "NA").
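
In dplyr 1.0.0 and later, mutate_at() is superseded by across(). A rough equivalent of the proper-NA variant might look like the sketch below. It assumes the updated data from the question and selects columns with starts_with("N_") rather than matches("N"); the name res is only for illustration.

```r
library(dplyr)

# Updated data from the question, including an NA in N_Neutral
a <- data.frame(
  Color      = c("Blue", "Red", "Green", "Total"),
  N_Likes    = c(5, 4, 1, 10),
  N_Dislikes = c(2, 4, 2, 8),
  N_Neutral  = c(1, NA, 2, 3)
)

# Divide each N_ column by its last value (the Total row) and
# keep missing values as real NA rather than the string "NA"
res <- a %>%
  mutate(across(
    starts_with("N_"),
    ~ ifelse(is.na(.x), NA_character_,
             paste0(round(100 * .x / last(.x), 2), "%"))
  ))
res
```

This still relies on Total being the last row, as in the original answer.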

Answer 2 (score: 2)

Using dplyr:

library(dplyr)
a %>% mutate_if(is.numeric, ~sprintf("%3.0f%%", .x / .x[length(.x)] * 100))
#  Color N_Likes N_Dislikes
#1  Blue     50%        25%
#2   Red     40%        50%
#3 Green     10%        25%
#4 Total    100%       100%

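Handling NAs with the revised data (assumed here to be stored in df); missing values come out as the string "NA":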

df %>% mutate_if(is.numeric, ~if_else(!is.na(.x), sprintf("%3.0f%%", .x / .x[length(.x)] * 100), "NA"))