Question

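I am currently trying to loop through a data frame named "test" and get the count of each number in each column. For example, sum(test$colname1 == 4) would return 1, because the value 4 occurs only once in the first column. I can't use count because I have 7 specific numbers (1-7) that I want to keep even when a number occurs 0 times. count only returns the numbers that actually occur and does not show zero counts. My plan was to use the apply function to loop over my columns, compute the sum for each of the 7 numbers, save the sums into one value, and then return those combined values into a data frame where I can do further calculations. apply needs a function, so I decided to do something like this: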

final_results <- cbind(final_results, apply(test, 2, applyFunction(indexOfApply))

applyFunction <- function(indexofApply) {
  temp <- c(sum(indexofApply == 1), sum(indexofApply == 2),
            sum(indexofApply == 3), sum(indexofApply == 4),
            sum(indexofApply == 5), sum(indexofApply == 6),
            sum(indexofApply == 7))

  return(temp)
}

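I would like my results to look like this:

My original data frame looks like this (column names kept confidential)

Is there a way to pass the index of the apply function to my own function, the way I am trying to, or is there an easier way to do this? I am new to R and feel there should be a better way. Please explain any answer you give so that I can learn. Thanks.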

Answer 1

You could look at the tabulate function. It does exactly what applyFunction is meant to do.

For example, if I use the following sample data:

test
   a b c d e
1  1 1 1 7 4
2  6 2 7 7 1
3  1 4 5 3 7
4  3 7 4 7 7
5  2 7 5 1 2
6  1 4 2 1 2
7  1 1 5 2 1
8  3 5 4 2 4
9  6 6 6 3 1
10 4 1 1 5 2
11 6 5 7 1 6
12 1 1 5 4 7

Then use the sapply function, which on a data frame works column by column, like apply(x, 2, fun):

result = as.data.frame(sapply(test, tabulate, 7))

You get:

result
  a b c d e
1 5 4 2 3 3
2 1 1 1 2 3
3 2 0 0 2 0
4 1 2 2 1 2
5 0 2 4 1 0
6 3 1 1 0 1
7 0 2 2 3 3
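
To see what tabulate does to a single column, here is a minimal sketch on a made-up vector (x is hypothetical and not part of the data above). The nbins argument fixes the number of bins, so values that never occur still get a 0:

```r
# x is a made-up example vector, not part of the data above
x <- c(1L, 4L, 4L, 7L)

# nbins = 7 forces one count per value 1..7, zeros included
counts <- tabulate(x, nbins = 7)
print(counts)
```

In the sapply call above, the trailing 7 is passed on to tabulate as nbins in the same way.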

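The drawback of tabulate is that it only handles positive integers. If your categories are not strictly 1 to 7, you can convert each column to a factor and then use table on it. Here is my code: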

result2 <- data.frame(sapply(test, function(x) table(factor(x,levels=1:7))))

result2 is the same as result, but you can change the categories by assigning different category names to levels.
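
As a rough illustration of the factor approach with categories that are not integers, here is a small sketch. The vector answers and the level names in lvls are made up for this example:

```r
# answers is a made-up character column; the level names are assumptions
answers <- c("low", "high", "low", "mid", "low")
lvls <- c("low", "mid", "high", "none")

# levels that never occur in the data ("none") are still counted, as 0
counts <- table(factor(answers, levels = lvls))
print(counts)
```

Because the levels are fixed up front, every column produces counts in the same order and of the same length, which is what lets sapply bind them into one table.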

Answer 2

# simulating a data set
df <- data.frame(col1 = sample(1:10, 10, replace = TRUE),
                 col2 = sample(1:10, 10, replace = TRUE),
                 col3 = sample(1:10, 10, replace = TRUE))

my_vals <- 1:7

# a shell for the results: one row per value, one column per column of df
df2 <- as.data.frame(matrix(rep(0, 21), ncol = 3))

# count how often each value occurs in each column
for (i in seq_along(my_vals)) {
  for (j in seq_len(ncol(df))) {
    df2[i, j] <- sum(df[, j] == my_vals[i])
  }
}

names(df2) <- names(df)

df2


  col1 col2 col3
1    2    0    1
2    0    1    0
3    0    2    1
4    0    0    0
5    1    1    3
6    2    2    2
7    1    3    1
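
On the original question of passing "the index" to FUN: apply and sapply call FUN once per column and hand it the column itself as the first argument, so the function is given by name, without parentheses or arguments. A minimal sketch of that idea, assuming a small made-up data frame (toy and count_1_to_7 are hypothetical names, not from the question):

```r
# toy is a made-up data frame standing in for the real data
toy <- data.frame(a = c(1, 1, 4, 7), b = c(2, 2, 2, 6))

# receives one whole column at a time and counts each value 1..7
count_1_to_7 <- function(column) {
  sapply(1:7, function(v) sum(column == v))
}

# pass the function itself; sapply supplies each column as the argument
counts <- as.data.frame(sapply(toy, count_1_to_7))
print(counts)
```

This gives the same kind of table as the nested loop, with one row per value and one column per column of the input.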

Pass the index of the apply function to FUN in R, or should something different be done?
