Question

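I have a data frame with several columns and rows. My goal is to add a new column immediately after each existing column, holding each value's proportion of that column's sum.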

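I have something like this (values taken from the desired output below):

a b c
1 4 5
8 2 3
1 4 2

I am trying to transform it into: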

a a.2 b b.2 c c.2
1 0.1 4 0.4 5 0.5 
8 0.8 2 0.2 3 0.3
1 0.1 4 0.4 2 0.2

But I cannot find a way to name these new columns in add_column inside a loop.

My code so far is:

j=1
while (j <= length(colnames(eleicao))) {
  i <- colnames(sample)[j]
  nam <- paste("prop", i, sep = ".")
  j=j+1
  sample <- add_column(sample, parse(nam) = as.list(sample[i]/colSums(sample[i]))[[1]] .after = i)
}

I always get the same error: Error: Column 'nam' already exists.

How can I achieve this? How do I make add_column understand that I am trying to use the value of 'nam' as the column name?
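
On the literal naming question: written as `nam = ...`, the left-hand side is taken as the column name itself, so each iteration tries to create a column called nam, which would explain the "already exists" error. Below is a minimal sketch of one way around that, assuming a tibble version whose add_column accepts tidy-eval dynamic dots (`!!name := value`). The data frame `df` is rebuilt from the example values and is not the asker's real data.

```r
library(tibble)

# Example data reconstructed from the question (an assumption, not the real data)
df <- tibble(a = c(1, 8, 1), b = c(4, 2, 4), c = c(5, 3, 2))

# names(df) is evaluated once, so the loop only visits the original columns
for (col in names(df)) {
  nam <- paste("prop", col, sep = ".")
  # `!!nam :=` uses the string stored in nam as the new column's name
  df <- add_column(df, !!nam := df[[col]] / sum(df[[col]]), .after = col)
}

names(df)
```

Iterating over the column names fixed at the start also avoids the manual counter, which would otherwise have to skip the columns added along the way.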

Answer 1

A slightly sloppy solution (using apply):

# Using OPs data stored in df
res <- do.call(cbind, apply(df, 2, function(x) data.frame(x, y = x / sum(x))))
#   a.x a.y b.x b.y c.x c.y
# 1   1 0.1   4 0.4   5 0.5
# 2   8 0.8   2 0.2   3 0.3
# 3   1 0.1   4 0.4   2 0.2

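# Rename: drop the ".x" suffix and turn ".y" into ".2"
# (dots are escaped and anchored so they only match a literal suffix)
colnames(res) <- sub("\\.x$", "", sub("\\.y$", ".2", names(res)))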

Answer 2

Here is an option using prop.table:

cbind(df1, prop.table(as.matrix(df1), 2))[order(rep(names(df1), 2))]
#  a a.1 b b.1 c c.1
#1 1 0.1 4 0.4 5 0.5
#2 8 0.8 2 0.2 3 0.3
#3 1 0.1 4 0.4 2 0.2
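
To unpack what that one-liner relies on: prop.table with margin 2 divides every entry by its column total. The sketch below checks this against a hand-written sweep, assuming `df1` holds the example values from the question.

```r
# Example data reconstructed from the question (an assumption)
df1 <- data.frame(a = c(1, 8, 1), b = c(4, 2, 4), c = c(5, 3, 2))
m <- as.matrix(df1)

# Column-wise proportions: margin = 2 means "within each column"
p <- prop.table(m, 2)

# The same computation written out by hand
by_hand <- sweep(m, 2, colSums(m), "/")

all.equal(p, by_hand)
colSums(p)  # each column of proportions sums to 1
```

In the answer's code, cbind then yields duplicated names (a, b, c, a, b, c), and the `[` subsetting by `order(rep(names(df1), 2))` both interleaves the columns and makes the duplicated names unique, which is where a.1, b.1 and c.1 come from.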

Answer 3

The following solution relies on dplyr, which is included in the tidyverse:

library(tidyverse)

df <- tibble(
  a = c(1, 8, 1),
  b = c(4, 2, 4),
  c = c(5, 3, 2)
)

df %>% 
  mutate_all(funs(prop = . / sum(.)))
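
Note that funs() has been deprecated since dplyr 0.8.0 and mutate_all() was superseded in dplyr 1.0.0. A rough equivalent with across() is sketched below, assuming dplyr 1.0.0 or later. The `_prop` suffix and the final reordering step are illustrative choices, not part of the original answer.

```r
library(dplyr)

# Same example data as above
df <- tibble::tibble(
  a = c(1, 8, 1),
  b = c(4, 2, 4),
  c = c(5, 3, 2)
)

# Add one proportion column per existing column, named <col>_prop
res <- df %>%
  mutate(across(everything(), ~ .x / sum(.x), .names = "{.col}_prop"))

# Optional: interleave so each proportion column follows its source column
res <- res[order(sub("_prop$", "", names(res)))]

names(res)
```

The reordering sorts columns by their base name, so it only preserves the original order when the column names are already sorted, as they are here.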
