Question

I want to find the mean of each group defined by answer_options. Unfortunately, I can't even work out how to structure a solution.

 answer_options <- c(3,3,3,2,2,4,4,4,4)
 options <- c(33,32,31,10,15,5,5,6,6)
 dd <- data.matrix(cbind(answer_options,options))

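To compute the group means, I first need to work out that the first group has 3 values: 33, 32, 31. After computing the first mean from group 1, I move on to answer_options[1 + 3], which is 2, and repeat the process.

For example:

Group 1: c(3,3,3), whose mean is mean(c(33,32,31))
Group 2: c(2,2), whose mean is mean(c(10,15))
Group 3: c(4,4,4,4), whose mean is mean(c(5,5,6,6))

Then I need to compute the mean of those means.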

 c3 <- answer_options
## for loop: I do not know how to write this
 a1 <- c3[1]+1 
 a2 <- c3[a1]
 a3 <- c3[a1+c3[a1]]
 a4 <- c3[c3[a1+c3[a1]]]
 a5 <- c3[c3[1]+1 +c3[a1]+c3[a1+c3[a1]]]

The sequence should look like this (where "2." and "3." stand for the 2nd and 3rd terms of the sequence):

1
c3[1]
c3[1 + 2.]
c3[1 + 2. + 3.]
...

I'm stuck on this problem and hope you can help. Many thanks.

Edit: I added some further information to make my question clearer.

Answer 1

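I'm not sure whether a data frame, rather than a matrix, would work for you. I used dplyr to do what you're asking. I'm not an expert programmer, so this may be inefficient.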

answer_options <- c(3,3,3,2,2,4,4,4,4)
options <- c(33,32,31,10,15,5,5,6,6)
dd <- data.frame(cbind(answer_options,options))

Using the %>% pipe operator with dplyr, you can produce summary information for the data frame:

   library(dplyr)
   new.dd <- dd %>% group_by(answer_options) %>% 
    summarise(n=n(),
              mean_answer_options=mean(options))


     answer_options     n mean_answer_options
           (dbl) (int)               (dbl)
1              2     2                12.5
2              3     3                32.0
3              4     4                 5.5

Then merge the two tables.

merged.dd<-left_join(dd,new.dd,by="answer_options")
merged.dd
  answer_options options n mean_answer_options
1              3      33 3                32.0
2              3      32 3                32.0
3              3      31 3                32.0
4              2      10 2                12.5
5              2      15 2                12.5
6              4       5 4                 5.5
7              4       5 4                 5.5
8              4       6 4                 5.5
9              4       6 4                 5.5
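If the separate summary table is not needed, base R's `ave()` can attach each group's mean to the rows directly, without a join. This is only a minimal sketch on the same data; the column name `group_mean` is made up for illustration.

```r
answer_options <- c(3, 3, 3, 2, 2, 4, 4, 4, 4)
options <- c(33, 32, 31, 10, 15, 5, 5, 6, 6)
dd <- data.frame(answer_options, options)

# ave() returns a vector as long as its input, where each element is
# replaced by the mean of its group (FUN defaults to mean).
dd$group_mean <- ave(dd$options, dd$answer_options)
dd
```

Like the dplyr version, this groups by the value of answer_options, so two different blocks that happen to have the same size would be pooled together.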

Edit, in response to the comments:

You need another variable that uniquely identifies each case you want to summarise, for example "question".

question<-c(1,1,1,2,2,3,3,3,3,4,4,4,4)
answer_options <- c(3,3,3,2,2,4,4,4,4,4,4,4,4)
options <- c(33,32,31,10,15,5,5,6,6,1,1,2,2)

dd <- data.frame(cbind(question,answer_options,options)) 
dd

library(dplyr)
new.dd <- dd %>% group_by(question) %>% 
    summarise(n=n(),mean_options_question=mean(options))
new.dd

merged.dd<-left_join(dd,new.dd,by="question")
merged.dd

This gives you the following output.

   question answer_options options n mean_options_question
1         1              3      33 3                  32.0
2         1              3      32 3                  32.0
3         1              3      31 3                  32.0
4         2              2      10 2                  12.5
5         2              2      15 2                  12.5
6         3              4       5 4                   5.5
7         3              4       5 4                   5.5
8         3              4       6 4                   5.5
9         3              4       6 4                   5.5
10        4              4       1 4                   1.5
11        4              4       1 4                   1.5
12        4              4       2 4                   1.5
13        4              4       2 4                   1.5
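If the data does not already have a `question` column, one could be derived from `answer_options` itself, assuming each block begins with its own size (a 3 means the next 3 rows belong together, and so on). That is the index sequence described in the question, written as a loop. A rough sketch only; `pos`, `size`, and `id` are names invented here.

```r
answer_options <- c(3, 3, 3, 2, 2, 4, 4, 4, 4, 4, 4, 4, 4)
options <- c(33, 32, 31, 10, 15, 5, 5, 6, 6, 1, 1, 2, 2)

question <- numeric(length(answer_options))
pos <- 1  # first row of the current block
id <- 0   # running question number
while (pos <= length(answer_options)) {
  size <- answer_options[pos]           # block size is read from its first row
  id <- id + 1
  question[pos:(pos + size - 1)] <- id  # label every row of the block
  pos <- pos + size                     # jump to the start of the next block
}
question
# [1] 1 1 1 2 2 3 3 3 3 4 4 4 4
```

The loop trusts the data: if a block is shorter than its stated size, the labels will drift, so it is worth checking the result against the raw rows.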

Answer 2

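From your question, you want to compute the mean of the group means, is that right? If so, the following code first computes the mean of each group (note that I converted your input to a data frame rather than a matrix):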

# Your input as a dataframe and not a matrix
> answer_options <- c(3,3,3,2,2,4,4,4,4)
> options <- c(33,32,31,10,15,5,5,6,6)
> dd <- data.frame(cbind(answer_options,options))

# Calculates the mean of each group and stores the result in
# "mean_answer_options"
> mean_answer_options = by(dd$options,answer_options, FUN = mean)
> mean_answer_options
answer_options: 2
[1] 12.5
------------------------------------------------------------------------
answer_options: 3
[1] 32
------------------------------------------------------------------------
answer_options: 4
[1] 5.5

You can then compute the mean of the group means with the following command:

> mean(as.numeric(mean_answer_options))
[1] 16.66667

This yields 16.66667, the correct mean of the group means. You can cross-check it as follows:

> (12.5+32+5.5)/3
[1] 16.66667
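As a possible alternative to `by()`, `tapply()` returns the group means as a plain named vector, which makes the second step a little shorter. The sketch below also computes the overall mean for comparison; the variable names `group_means`, `mean_of_means`, and `overall_mean` are chosen here for illustration.

```r
answer_options <- c(3, 3, 3, 2, 2, 4, 4, 4, 4)
options <- c(33, 32, 31, 10, 15, 5, 5, 6, 6)

# One mean per distinct value of answer_options, named by that value
group_means <- tapply(options, answer_options, mean)

mean_of_means <- mean(group_means)  # every group counts equally
overall_mean <- mean(options)       # every row counts equally

c(mean_of_means = mean_of_means, overall_mean = overall_mean)
```

The two numbers differ (about 16.67 versus about 15.89) because the groups have different sizes, so it matters which one you actually want.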

If this isn't what you were asking for, could you clarify what I may have misunderstood? Hope this helps!

Finding the mean of groups with a loop in R
