Question

I have the following data:

Exp = my data frame

library(data.table)

dt <- data.table(Game = c(rep(1, 9), rep(2, 3)),
                 Round = rep(1:3, 4),
                 Participant = rep(1:4, each = 3),
                 Left_Choice = c(1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1),
                 Total_Points = c(5, 15, 12, 16, 83, 7, 4, 8, 23, 6, 9, 14))

> dt
    Game Round Participant Left_Choice Total_Points
 1:    1     1           1           1            5
 2:    1     2           1           0           15
 3:    1     3           1           0           12
 4:    1     1           2           1           16
 5:    1     2           2           1           83
 6:    1     3           2           0            7
 7:    1     1           3           0            4
 8:    1     2           3           0            8
 9:    1     3           3           1           23
10:    2     1           4           1            6
11:    2     2           4           1            9
12:    2     3           4           1           14

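Now I need to do the following:

First, for each participant in each game, I need to calculate the average "left choice rate".
After that, I want to split the results into 5 groups (left choice < 20%, 20% to 40%, etc.).
For each group (within each game), I want to calculate the mean of Total_Points in the last round, which is round 3 in this simple example [round 3 values only]. For example, for participant 1 in game 1, the total points in round 3 are 12, and for participant 4 in game 2 they are 14.

So in the first stage, I think I should calculate the following: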

Game  Participant  Percent_left  Total_Points (in last round)
1     1            33%           12
1     2            66%           7
1     3            33%           23
2     4            100%          14

The final result should look like this:

Game  Left_Choice  Total_Points (average)
1     <35%         17.5 = (12+23)/2
1     35%-70%      7
1     >70%         NA
2     <35%         NA
2     35%-70%      NA
2     >70%         14

Please help! :)

Answer 1

1: A simple group mean, using by:

    dt[, pct_left := mean(Left_Choice), by = .(Game, Participant)]
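To check step 1 against the table in the question, here is a minimal sketch that rebuilds the example data and prints one row per participant. The name per_participant is only an illustrative choice, not something from the question.

```r
library(data.table)

# Example data copied from the question
dt <- data.table(Game = c(rep(1, 9), rep(2, 3)),
                 Round = rep(1:3, 4),
                 Participant = rep(1:4, each = 3),
                 Left_Choice = c(1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1),
                 Total_Points = c(5, 15, 12, 16, 83, 7, 4, 8, 23, 6, 9, 14))

# Add the per-participant left-choice rate to every row, by reference
dt[, pct_left := mean(Left_Choice), by = .(Game, Participant)]

# Collapse to one row per participant (per_participant is a hypothetical name)
per_participant <- unique(dt[, .(Game, Participant, pct_left)])
print(per_participant)
```

The rates come out as 1/3, 2/3, 1/3 and 1, matching the 33%, 66%, 33% and 100% in the question.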

2: Use cut. It's not entirely clear what you want at the boundaries, but I think you want include.lowest=TRUE:

    dt[, pct_grp := cut(pct_left, breaks = seq(0, 1, by = .2), include.lowest = TRUE)]
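The reason include.lowest matters is easiest to see on a small vector. The sketch below uses the four rates from the example plus a rate of exactly 0, which does not occur in the question's data and is added here only as an illustration.

```r
# Left-choice rates: 0 is a made-up edge case; the rest come from the example
pct_left <- c(0, 1/3, 2/3, 1)
breaks <- seq(0, 1, by = .2)

# By default the bins are (a, b], so a rate of exactly 0 falls in no bin
grp_default <- cut(pct_left, breaks = breaks)

# include.lowest = TRUE closes the first bin on the left, so 0 is kept
grp <- cut(pct_left, breaks = breaks, include.lowest = TRUE)

print(data.frame(pct_left, grp_default, grp))
```

A participant who never chooses left would otherwise end up with an NA group and silently drop out of any grouped summary.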

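3: A slightly more complicated by:

    dt[Round == max(Round), end_mean := mean(Total_Points), by = .(pct_grp, Game)]

(If you only want the reduced table, use .(end_mean = mean(Total_Points)) in place of end_mean := mean(Total_Points).)

You didn't make it clear whether there is a global maximum number of rounds (i.e. whether all games end after the same number of rounds); the code above assumes there is. You would have to be clearer about this for me to give an exact alternative, but I suggest starting by computing the mean round by round:

    dt[, end_mean := mean(Total_Points), by = .(pct_grp, Game, Round)]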
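Putting the three steps together, here is a sketch of one way to get the reduced table when games may end after different numbers of rounds. It assumes "last round" means the highest Round seen within each game, which the question does not confirm; last_round and res are hypothetical names introduced for this example.

```r
library(data.table)

# Example data copied from the question
dt <- data.table(Game = c(rep(1, 9), rep(2, 3)),
                 Round = rep(1:3, 4),
                 Participant = rep(1:4, each = 3),
                 Left_Choice = c(1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1),
                 Total_Points = c(5, 15, 12, 16, 83, 7, 4, 8, 23, 6, 9, 14))

# Steps 1 and 2: left-choice rate per participant, then binned into 5 groups
dt[, pct_left := mean(Left_Choice), by = .(Game, Participant)]
dt[, pct_grp := cut(pct_left, breaks = seq(0, 1, by = .2), include.lowest = TRUE)]

# Assumption: the last round is defined per game, not globally
dt[, last_round := max(Round), by = Game]

# Step 3: mean of Total_Points in each game's last round, per group
res <- dt[Round == last_round,
          .(end_mean = mean(Total_Points)),
          by = .(Game, pct_grp)]
print(res)
```

On the example data this gives 17.5 and 7 for the two occupied groups in game 1 and 14 for game 2. Groups with no participants do not appear at all, rather than showing up as NA rows as in the desired output.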
