Question

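Here is an example of what my dataset (MergedData) looks like in R: each of my participants (5 rows) has a score on each test (7 columns). I want the total score across all tests (all columns), but for each participant (row).

Also, my full dataset contains more than just these variables, so if possible I would like to do this with a formula and a loop rather than going row by row or column by column.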

Participant TestScores     
ParticipantA    2   4   2   3   2   3   4
ParticipantB    1   3   2   2   3   3   3
ParticipantC    1   4   4   2   3   4   2
ParticipantD    2   4   2   3   2   4   4
ParticipantE    1   3   2   2   2   2   2

I tried this, but it doesn't work:

Test_Scores <- rowSums(MergedData[Test1, Test2, Test3], na.rm=TRUE)

I get the following error message:

Error in `[.data.frame`(MergedData, Test1, Test2, Test3,  : 
  unused arguments

How can I fix this? Thanks!!

Answer 1

I think you want this:

rowSums(MergedData[,c('Test1', 'Test2', 'Test3')], na.rm=TRUE)
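As a rough sketch of why the original call fails: inside `[`, bare names such as Test1 are looked up as variables rather than as columns, and a data frame takes a row index and a column index, not a list of columns. Passing the column names as a single character vector in the column position works. The data below is a made-up stand-in for MergedData that contains only the first three tests from the question.

```r
# Hypothetical stand-in for MergedData (first three tests only)
MergedData <- data.frame(
  Participant = c("ParticipantA", "ParticipantB", "ParticipantC"),
  Test1 = c(2, 1, 1),
  Test2 = c(4, 3, 4),
  Test3 = c(2, 2, 4)
)

# Column names go after the comma, as one character vector
test_cols <- c("Test1", "Test2", "Test3")
Test_Scores <- rowSums(MergedData[, test_cols], na.rm = TRUE)
```

Test_Scores then holds one total per participant, in the same order as the rows.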

Answer 2

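See the documentation for ?rowSums and ?colSums.

It isn't clear from your post what exactly MergedData is. Assuming it is a data.frame, the problem is your indexing, MergedData[Test1, Test2, Test3]. If it is a data.frame whose columns are all numeric, you want to run the following: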

Test_Scores <- rowSums(MergedData, na.rm = TRUE)

or

Test_Scores <- rowSums(MergedData[, c("Test1", "Test2", "Test3")], na.rm = TRUE)

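if you only want to use the columns named "Test1", "Test2", and "Test3" (assuming they really are named that way).

If this doesn't work, please show us the output of str(MergedData).

You need to provide a minimal reproducible example of the error to get any really useful answer.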
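If the data frame also holds a non-numeric column such as Participant, rowSums() on the whole frame stops with an error because it needs numeric input. One possible way around that, sketched here on made-up data, is to keep only the numeric columns first. This assumes every numeric column is a test score; an ID or age column would be summed as well.

```r
# Hypothetical data; the NA is there to show what na.rm = TRUE does
MergedData <- data.frame(
  Participant = c("ParticipantA", "ParticipantB"),
  Test1 = c(2, 1),
  Test2 = c(4, NA),
  Test3 = c(2, 2)
)

# Logical vector marking which columns are numeric
is_num <- vapply(MergedData, is.numeric, logical(1))

# Sum only the numeric columns; missing scores are skipped
Test_Scores <- rowSums(MergedData[, is_num, drop = FALSE], na.rm = TRUE)
```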

Answer 3

You can use:

MergedData$Test_Scores_Sum <- rowSums(MergedData[,2:8], na.rm=TRUE)

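2:8 covers all the columns (tests) you want to sum. This creates another column in your data.

That way you don't have to type every column name, and your data frame can still contain other columns that are left out of the sum. Note, however, that all the test columns you want to sum must sit next to each other (as in the example data).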
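If the test columns are not adjacent, a possible alternative is to pick them by name pattern instead of by position. The sketch below assumes the test columns are named Test followed by a number; the Age column and the data are invented for illustration.

```r
# Hypothetical data with an unrelated column between the tests
MergedData <- data.frame(
  Participant = c("ParticipantA", "ParticipantB"),
  Test1 = c(2, 1),
  Age   = c(31, 27),
  Test2 = c(4, 3),
  Test3 = c(2, 2)
)

# Names of all columns that look like Test1, Test2, ...
test_cols <- grep("^Test[0-9]+$", names(MergedData), value = TRUE)

# Add the per-participant total as a new column
MergedData$Test_Scores_Sum <- rowSums(MergedData[, test_cols], na.rm = TRUE)
```

Because the pattern requires digits right after Test, the new Test_Scores_Sum column is not picked up if the code is run a second time.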

Answer 4

For small data, it can be interesting to convert the data.frame to a table and then use addmargins().

Using this sample data

MergedData<-data.frame(Participant=letters[1:5],
    Test1 = c(2,1,1,2,1),
    Test2 = c(4,3,4,4,3),
    Test3 = c(2,2,4,2,2),
    Test4 = c(3,2,2,3,2),
    Test5 = c(2,3,3,2,2)
)

and this helper function

as.table.data.frame<-function(x, rownames=0) {
    numerics <- sapply(x,is.numeric)
    chars <- which(sapply(x,function(x) is.character(x) || is.factor(x)))
    names <- if(!is.null(rownames)) {
        if (length(rownames)==1) {
            if (rownames ==0) {
                 rownames(x)
            } else {
                as.character(x[,rownames])
            }
        } else {
            rownames
        }
    } else {
          if(length(chars)==1) {
            as.character(x[,chars])
        } else {
            rownames(x)
        }
    }
    x<-as.matrix(x[,numerics])
    rownames(x)<-names
    structure(x, class="table")
}

you can run (rownames = 1 takes the row labels from the first column, Participant)

addmargins(as.table(MergedData, rownames = 1))

to get

    Test1 Test2 Test3 Test4 Test5 Sum
a       2     4     2     3     2  13
b       1     3     2     2     3  11
c       1     4     4     2     3  14
d       2     4     2     3     2  13
e       1     3     2     2     2  10
Sum     7    18    12    12    12  61

Probably not super useful in this case, but addmargins is still handy to know about.
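A shorter route to the same margins, if the custom method feels like overkill, might be to build the numeric matrix by hand and let the built-in as.table() handle it. This is only a sketch on the same sample data; the names scores and with_totals are mine.

```r
MergedData <- data.frame(
  Participant = letters[1:5],
  Test1 = c(2, 1, 1, 2, 1),
  Test2 = c(4, 3, 4, 4, 3),
  Test3 = c(2, 2, 4, 2, 2),
  Test4 = c(3, 2, 2, 3, 2),
  Test5 = c(2, 3, 3, 2, 2)
)

# Numeric part as a matrix, labelled by participant
scores <- as.matrix(MergedData[, -1])
rownames(scores) <- as.character(MergedData$Participant)

# addmargins() appends a "Sum" row and a "Sum" column
with_totals <- addmargins(as.table(scores))
```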

Answer 5

Four earlier answers and only one of them shows a result? What's up with that? Here's one.

> dat <- read.table(header=T, text = 
  'Participant Test1 Test2 Test3 Test4 Test5 Test6 Test7     
  ParticipantA    2   4   2   3   2   3   4
  ParticipantB    1   3   2   2   3   3   3
  ParticipantC    1   4   4   2   3   4   2
  ParticipantD    2   4   2   3   2   4   4
  ParticipantE    1   3   2   2   2   2   2')

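You wrote that

"...if possible I would like to do this with a formula & loop, rather than having to go row by row / column by column"

You don't have to write any loops at all. rowSums() and colSums() operate on all rows and all columns at once, with no explicit loop.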

> rowSums(dat[-1], na.rm = TRUE)
## [1] 20 17 20 21 14
> colSums(dat[-1], na.rm = TRUE)
##  Test1  Test2  Test3  Test4  Test5  Test6  Test7 
##      7     18     12     12     12     16     15
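To keep track of which total belongs to which participant, one option is to attach the participant names to the row sums. A small sketch on the same data; row_totals, col_totals and grand_total are names I made up.

```r
dat <- read.table(header = TRUE, text = "
Participant  Test1 Test2 Test3 Test4 Test5 Test6 Test7
ParticipantA 2 4 2 3 2 3 4
ParticipantB 1 3 2 2 3 3 3
ParticipantC 1 4 4 2 3 4 2
ParticipantD 2 4 2 3 2 4 4
ParticipantE 1 3 2 2 2 2 2")

# Per-participant totals, labelled with the participant names
row_totals <- setNames(rowSums(dat[-1], na.rm = TRUE),
                       as.character(dat$Participant))

# Per-test totals and the overall total
col_totals <- colSums(dat[-1], na.rm = TRUE)
grand_total <- sum(col_totals)
```

A single total can then be looked up by name, for example row_totals["ParticipantD"].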

Answer 6

Here is a way to do this using dplyr and reshape2

dat <- read.table(header=T, text = 
                    'Participant Test1 Test2 Test3 Test4 Test5 Test6 Test7     
  ParticipantA    2   4   2   3   2   3   4
  ParticipantB    1   3   2   2   3   3   3
  ParticipantC    1   4   4   2   3   4   2
  ParticipantD    2   4   2   3   2   4   4
  ParticipantE    1   3   2   2   2   2   2')

library(dplyr) 
library(reshape2)    

# Melt data into long format
dat.l = melt(dat, id.var="Participant", variable.name="Test")    
> dat.l
    Participant  Test value
1  ParticipantA Test1     2
2  ParticipantB Test1     1
3  ParticipantC Test1     1
4  ParticipantD Test1     2
...
32 ParticipantB Test7     3
33 ParticipantC Test7     2
34 ParticipantD Test7     4
35 ParticipantE Test7     2

# Sum by Participant
dat.l %>%
  group_by(Participant) %>%
  summarise(Sum=sum(value))

   Participant Sum
1 ParticipantA  20
2 ParticipantB  17
3 ParticipantC  20
4 ParticipantD  21
5 ParticipantE  14

# Sum by Test
dat.l %>%
  group_by(Test) %>%
  summarise(Sum=sum(value))

   Test Sum
1 Test1   7
2 Test2  18
3 Test3  12
4 Test4  12
5 Test5  12
6 Test6  16
7 Test7  15
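The same long-format idea can probably be reproduced without extra packages. The sketch below swaps melt() for base R's stack() and the grouped summarise for aggregate(); it is an alternative technique, not what the answer above uses, and the object names are mine.

```r
dat <- read.table(header = TRUE, text = "
Participant  Test1 Test2 Test3 Test4 Test5 Test6 Test7
ParticipantA 2 4 2 3 2 3 4
ParticipantB 1 3 2 2 3 3 3
ParticipantC 1 4 4 2 3 4 2
ParticipantD 2 4 2 3 2 4 4
ParticipantE 1 3 2 2 2 2 2")

# Long format: stack() puts the scores in `values` and the test name in `ind`;
# the participant labels are repeated once per test column
long <- data.frame(
  Participant = rep(as.character(dat$Participant), times = ncol(dat) - 1),
  stack(dat[-1])
)

# Sum by participant and by test
by_participant <- aggregate(values ~ Participant, data = long, FUN = sum)
by_test <- aggregate(values ~ ind, data = long, FUN = sum)
```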

Row and column sums in R
