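Column-wise sums and percentage calculation in R

Time: 2019-01-10 21:25:46

Tags: r sum

I have a data table like the one below and want to produce the output shown after it: add a row named "Percentage" that holds, for each year, S as a percentage of that year's total. See the output table below.

How can I do this the R data.table way?

Thanks for your help.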

Table:

Category   1998  1999  2000  2001  2002 .....  2018
No_History 10    15    2     22    15   .....  16
NS         17    20    15    23    10   .....  21
S          15    14    85    25    47   ...... 15


Output:

    Category    1998  1999  2000  2001  2002 .....  2018
    No_History  10    15    2     22    15   .....  16
    NS          17    20    15    23    10   .....  21
    S           15    14    85    25    47   .....  15
    Percentage  35.7  28.5  83.3  35.7  65.2 .....  28.8

Simply calculate percentage = S/(No_History+NS+S)*100
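
As a quick check of the formula, here is a minimal base R sketch that uses only the six year columns visible in the table (the elided years are left out). The vector names are made up for illustration.

```r
# Counts for the visible years: 1998, 1999, 2000, 2001, 2002, 2018
no_history <- c(10, 15, 2, 22, 15, 16)
ns         <- c(17, 20, 15, 23, 10, 21)
s          <- c(15, 14, 85, 25, 47, 15)

# percentage = S / (No_History + NS + S) * 100, computed element-wise per year
percentage <- 100 * s / (no_history + ns + s)
names(percentage) <- c("1998", "1999", "2000", "2001", "2002", "2018")

round(percentage, 1)
# 1998 1999 2000 2001 2002 2018 
# 35.7 28.6 83.3 35.7 65.3 28.8 
```

With proper rounding, 1999 and 2002 come out as 28.6 and 65.3; the output table above shows those two values truncated (28.5 and 65.2).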

1 Answer:

Answer 0: (score: 0)

Maybe something like this. First, I create a data frame.

# Create data frame
df <- read.table(text ="Category   1998  1999  2000  2001  2002 2018
No_History 10    15    2     22    15   16
NS         17    20    15    23    10   21
S          15    14    85    25    47   15", header = FALSE)

Then I have to restructure it into a more useful format. Using a tidy format makes life easier.

# Restructure data
library(data.table)

df <- t(df)                 # Transpose (yields a character matrix)
colnames(df) <- df[1, ]     # Use the first row as column names
df <- df[-1, ]              # Remove the first row
dt <- data.table(df)        # Convert to a data.table
dt[, names(dt) := lapply(.SD, as.numeric)]  # Convert all columns to numeric

Finally, I do the calculation:

# Do calculation
dt[, Percentage := 100 * S/(No_History + NS + S)]

which gives

#    Category No_History NS  S Percentage
# 1:     1998         10 17 15   35.71429
# 2:     1999         15 20 14   28.57143
# 3:     2000          2 15 85   83.33333
# 4:     2001         22 23 25   35.71429
# 5:     2002         15 10 47   65.27778
# 6:     2018         16 21 15   28.84615
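
The table above has one row per year. A fully long layout, with one row per Category/year pair, is another way to get a tidy format. The following is a minimal sketch, assuming data.table's `melt()` is acceptable; the names `wide`, `long`, `Year`, `Count` and `pct` are made up for illustration and are not part of the answer.

```r
library(data.table)

# The six visible year columns from the question, in the original wide layout
wide <- data.table(
  Category = c("No_History", "NS", "S"),
  `1998` = c(10, 17, 15),
  `1999` = c(15, 20, 14),
  `2000` = c(2, 15, 85),
  `2001` = c(22, 23, 25),
  `2002` = c(15, 10, 47),
  `2018` = c(16, 21, 15)
)

# Reshape to long format: one row per (Category, Year) pair
long <- melt(wide, id.vars = "Category",
             variable.name = "Year", value.name = "Count")

# Per year: share of S in the total of all three categories
pct <- long[, .(Percentage = 100 * Count[Category == "S"] / sum(Count)),
            by = Year]
pct
```

Grouping with `by = Year` keeps the calculation independent of how many year columns the real table has.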

To get the data back into the format you specified, I have to transpose the data table.

# Transpose back to desired format
t(dt)

#                  [,1]       [,2]       [,3]       [,4]       [,5]       [,6]
# Category   1998.00000 1999.00000 2000.00000 2001.00000 2002.00000 2018.00000
# No_History   10.00000   15.00000    2.00000   22.00000   15.00000   16.00000
# NS           17.00000   20.00000   15.00000   23.00000   10.00000   21.00000
# S            15.00000   14.00000   85.00000   25.00000   47.00000   15.00000
# Percentage   35.71429   28.57143   83.33333   35.71429   65.27778   28.84615
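
The transpose round trip can probably be avoided by working on the wide table directly. This is a minimal sketch under that assumption, not the method used above; `year_cols`, `s_row`, `pct` and `result` are hypothetical names, and only the six visible year columns are included.

```r
library(data.table)

# Wide table as in the question (visible years only)
dt <- data.table(
  Category = c("No_History", "NS", "S"),
  `1998` = c(10, 17, 15),
  `1999` = c(15, 20, 14),
  `2000` = c(2, 15, 85),
  `2001` = c(22, 23, 25),
  `2002` = c(15, 10, 47),
  `2018` = c(16, 21, 15)
)

year_cols <- setdiff(names(dt), "Category")  # every column except the label
s_row     <- which(dt$Category == "S")       # position of the S row

# For each year column: 100 * S / column total (a one-row data.table)
pct <- dt[, lapply(.SD, function(x) 100 * x[s_row] / sum(x)),
          .SDcols = year_cols]
pct[, Category := "Percentage"]

# Append the new row, matching columns by name
result <- rbind(dt, pct, use.names = TRUE)
result
```

This keeps `Category` as a character column and the years as numeric columns, so no conversion back and forth is needed.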

If you don't have to use data.table, you can also use dplyr.

# Create data frame
df <- read.table(text ="Category   1998  1999  2000  2001  2002 2018
No_History 10    15    2     22    15   16
NS         17    20    15    23    10   21
S          15    14    85    25    47   15", header = FALSE)

# Restructure data:
# Transpose
# Use first row as column names
# Remove first row
df <- t(df)
colnames(df) <- df[1, ]
df <- df[-1,]

# Convert to data frame
# Convert all columns to numeric
# Perform calculation
# Transpose result
library(dplyr)

df %>%
  data.frame %>%
  mutate_all(function(x) as.numeric(as.character(x))) %>%
  mutate(Percentage = 100 * S / (No_History + NS + S)) %>%
  t

#                  [,1]       [,2]       [,3]       [,4]       [,5]       [,6]
# Category   1998.00000 1999.00000 2000.00000 2001.00000 2002.00000 2018.00000
# No_History   10.00000   15.00000    2.00000   22.00000   15.00000   16.00000
# NS           17.00000   20.00000   15.00000   23.00000   10.00000   21.00000
# S            15.00000   14.00000   85.00000   25.00000   47.00000   15.00000
# Percentage   35.71429   28.57143   83.33333   35.71429   65.27778   28.84615
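
If the only goal is the wide table with one extra row, plain base R may be enough and both transposes can be skipped. The following is a minimal sketch, not part of the approaches above. It assumes the header row is read with `header = TRUE` and `check.names = FALSE` so that the year names stay as "1998" rather than "X1998"; the names `counts`, `pct` and `out` are made up for illustration.

```r
# Read the table with its header row; keep the year names unchanged
df <- read.table(text = "Category   1998  1999  2000  2001  2002 2018
No_History 10    15    2     22    15   16
NS         17    20    15    23    10   21
S          15    14    85    25    47   15",
                 header = TRUE, check.names = FALSE)

# Numeric matrix of counts with the categories as row names
counts <- as.matrix(df[, -1])
rownames(counts) <- df$Category

# S as a percentage of each column total
pct <- 100 * counts["S", ] / colSums(counts)

# Append the percentages as a new row
out <- rbind(counts, Percentage = pct)
round(out, 1)
```

The result is a numeric matrix rather than a data.table, which is usually fine for display but would need converting if further data.table operations are planned.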