How do I calculate percentages for multiple columns of a tbl in R?

Time: 2016-09-06 14:55:19

Tags: r dplyr percentage tidyr

Suppose I have a table (in tbl format) like the following:

       Type.1 count averageTotal averageHP averageAttack averageDefense averageSpAtk averageSp..Def averageSpeed
       (fctr) (int)        (dbl)     (dbl)         (dbl)          (dbl)        (dbl)          (dbl)        (dbl)
 1       Bug    69     378.9275  56.88406      70.97101       70.72464     53.86957       64.79710     61.68116
 2      Dark    31     445.7419  66.80645      88.38710       70.22581     74.64516       69.51613     76.16129
 3    Dragon    32     550.5312  83.31250     112.12500       86.37500     96.84375       88.84375     83.03125
 4  Electric    44     443.4091  59.79545      69.09091       66.29545     90.02273       73.70455     84.50000
 5     Fairy    17     413.1765  74.11765      61.52941       65.70588     78.52941       84.70588     48.58824

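If I want to express each of the other average columns as a percentage of the averageTotal column (row by row), how would I go about it?

Specifically, the result I want looks like this: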

      Type.1 count averageTotal averageHP averageAttack averageDefense averageSpAtk averageSp..Def averageSpeed
      (fctr) (int)        (dbl)     (dbl)         (dbl)          (dbl)        (dbl)          (dbl)        (dbl)
1       Bug    69     378.9275    15.02%       18.73%         18.73%        14.21%       17.11%     16.29%

2 answers:

Answer 0 (score: 4)

Here is a simple method in base R:

df[, 4:9] <- df[, 4:9] / df[[3]]

which returns

df
    Type.1 count averageTotal averageHP averageAttack averageDefense averageSpAtk averageSp..Def averageSpeed
1      Bug    69     378.9275 0.1501186     0.1872944      0.1866443    0.1421633      0.1710013    0.1627783
2     Dark    31     445.7419 0.1498770     0.1982921      0.1575481    0.1674627      0.1559560    0.1708641
3   Dragon    32     550.5312 0.1513311     0.2036669      0.1568939    0.1759096      0.1613782    0.1508202
4 Electric    44     443.4091 0.1348539     0.1558175      0.1495131    0.2030241      0.1662225    0.1905689
5    Fairy    17     413.1765 0.1793850     0.1489180      0.1590262    0.1900626      0.2050114    0.1175968

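The numbers refer to column positions, so columns 4 through 9 are each divided by the third column (averageTotal). The result is reported as proportions rather than percentages, but that is easy to fix. Run the following on the original data instead of the line above (running both would divide by averageTotal twice):
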
df[, 4:9] <- round(100 * df[, 4:9] / df[[3]], 2)
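
The question asks for values such as 15.02%, with a literal percent sign. A minimal sketch of one way to get there, assuming it is acceptable for the displayed columns to become character strings: keep the numeric percentages separately and format a display copy with sprintf. The names stat_cols, pct and df_display are made up for this illustration, the data is a trimmed-down stand-in for the question's table, and columns are selected by name rather than by position.

```r
# Trimmed-down stand-in for the question's data (two rows, two stat columns)
df <- data.frame(
  Type.1 = c("Bug", "Dark"),
  count = c(69L, 31L),
  averageTotal = c(378.9275, 445.7419),
  averageHP = c(56.88406, 66.80645),
  averageAttack = c(70.97101, 88.38710)
)

# Hypothetical helper vector: the columns to convert, selected by name
stat_cols <- c("averageHP", "averageAttack")

# Numeric percentages, rounded to two decimals (still usable for arithmetic)
pct <- round(100 * df[stat_cols] / df$averageTotal, 2)

# Display copy: the stat columns become character strings such as "15.01%"
df_display <- df
df_display[stat_cols] <- lapply(pct, function(x) sprintf("%.2f%%", x))

print(df_display)
```

Once the percent sign is attached the columns are no longer numeric, so it is probably best to do this only as a final formatting step.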

Data

df <- read.table(header=TRUE, text="       Type.1 count averageTotal averageHP averageAttack averageDefense averageSpAtk averageSp..Def averageSpeed
                 1       Bug    69     378.9275  56.88406      70.97101       70.72464     53.86957       64.79710     61.68116
                 2      Dark    31     445.7419  66.80645      88.38710       70.22581     74.64516       69.51613     76.16129
                 3    Dragon    32     550.5312  83.31250     112.12500       86.37500     96.84375       88.84375     83.03125
                 4  Electric    44     443.4091  59.79545      69.09091       66.29545     90.02273       73.70455     84.50000
                 5     Fairy    17     413.1765  74.11765      61.52941       65.70588     78.52941       84.70588     48.58824")

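Answer 1 (score: 1)

With dplyr, you can use mutate_at to specify the columns to change and define a custom function inside funs, where . stands for the column being mutated: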

library(dplyr)

df %>% mutate_at(vars(averageHP:averageSpeed), funs(. / averageTotal * 100))

## # A tibble: 5 × 9
##     Type.1 count averageTotal averageHP averageAttack averageDefense averageSpAtk averageSp..Def
##     <fctr> <int>        <dbl>     <dbl>         <dbl>          <dbl>        <dbl>          <dbl>
## 1      Bug    69     378.9275  15.01186      18.72944       18.66443     14.21633       17.10013
## 2     Dark    31     445.7419  14.98770      19.82921       15.75481     16.74627       15.59560
## 3   Dragon    32     550.5312  15.13311      20.36669       15.68939     17.59096       16.13782
## 4 Electric    44     443.4091  13.48539      15.58175       14.95131     20.30241       16.62225
## 5    Fairy    17     413.1765  17.93850      14.89180       15.90262     19.00626       20.50114
## # ... with 1 more variables: averageSpeed <dbl>
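
In more recent dplyr releases (1.0.0 and later), funs() is deprecated and mutate_at() is superseded by across(). A minimal sketch of the same calculation in that style, assuming dplyr >= 1.0.0 is installed; the data is a trimmed-down stand-in for the question's table and result is a name chosen for this illustration:

```r
library(dplyr)

# Trimmed-down stand-in for the question's data (two rows, three stat columns)
df <- data.frame(
  Type.1 = c("Bug", "Dark"),
  count = c(69L, 31L),
  averageTotal = c(378.9275, 445.7419),
  averageHP = c(56.88406, 66.80645),
  averageAttack = c(70.97101, 88.38710),
  averageSpeed = c(61.68116, 76.16129)
)

# across() picks the columns; inside the lambda, .x is the column being
# transformed, and averageTotal is looked up in the same data frame
result <- df %>%
  mutate(across(averageHP:averageSpeed, ~ .x / averageTotal * 100))

print(result)
```

Wrapping the expression in round(..., 2) inside the lambda would give two-decimal percentages, as in the base R answer.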