R - Getting a table of averages for a specific field

Time: 2016-02-28 17:21:10

Tags: r

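Background: In my problem I have a dataset in which each row represents a particular dive by a particular diver. Each row has its own score and its own judge.

What I want is the average score that a particular judge gave to divers, broken down by the divers' country.

Data: To illustrate what I mean: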

  Event Round    Diver Country Rank DiveNo Difficulty JScore                   Judge JCountry
1 M3mSB Final XIONG Ni     CHN    1      1        3.1    8.0 RUIZ-PEDREGUERA Rolando      CUB
2 M3mSB Final XIONG Ni     CHN    1      1        3.1    9.0             GEAR Dennis      NZL
3 M3mSB Final XIONG Ni     CHN    1      1        3.1    8.5           BOYS Beverley      CAN
4 M3mSB Final XIONG Ni     CHN    1      1        3.1    8.5           JOHNSON Bente      NOR
5 M3mSB Final XIONG Ni     CHN    1      1        3.1    8.5         BOUSSARD Michel      FRA
6 M3mSB Final XIONG Ni     CHN    1      1        3.1    8.5          CALDERON Felix      PUR

What I tried (this seems to work):

countries <- unique(x$Country[x$Judge==thisjudge])
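AvgByCountry <- vector(mode = "numeric", length = length(countries))
for (i in seq_along(countries)) {
  # Test both conditions against the full data frame so the logical index
  # has the same length as x$JScore and is not silently recycled
  AvgByCountry[i] <- mean(x$JScore[x$Judge == thisjudge & x$Country == countries[i]])
}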
names(AvgByCountry) <- countries
AvgByCountry

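Question: I know a loop is probably not the best way to do this, but is there a better approach? I tried subsetting and a few other things, but none of them gave me what I wanted.

1 Answer:

Answer 0: (score: 2)

Using data.table: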

library(data.table)
set.seed(100)
DT <- data.table(X = rnorm(20),  "Country" = sample(c("US","UK"), 10, TRUE))

                 X Country
 1: -0.50219235      US
 2:  0.13153117      UK
 3: -0.07891709      UK
 4:  0.88678481      UK
 5:  0.11697127      UK
 6:  0.31863009      US
 7: -0.58179068      UK
 8:  0.71453271      UK
 9: -0.82525943      US
10: -0.35986213      US
11:  0.08988614      US
12:  0.09627446      UK
13: -0.20163395      UK
14:  0.73984050      UK
15:  0.12337950      UK
16: -0.02931671      US
17: -0.38885425      UK
18:  0.51085626      UK
19: -0.91381419      US
20:  2.31029682      US

DT[, Mean:=mean(X), by= 'Country']

              X Country       Mean
 1: -0.50219235      US 0.01104603
 2:  0.13153117      UK 0.17241456
 3: -0.07891709      UK 0.17241456
 4:  0.88678481      UK 0.17241456
 5:  0.11697127      UK 0.17241456
 6:  0.31863009      US 0.01104603
 7: -0.58179068      UK 0.17241456
 8:  0.71453271      UK 0.17241456
 9: -0.82525943      US 0.01104603
10: -0.35986213      US 0.01104603
11:  0.08988614      US 0.01104603
12:  0.09627446      UK 0.17241456
13: -0.20163395      UK 0.17241456
14:  0.73984050      UK 0.17241456
15:  0.12337950      UK 0.17241456
16: -0.02931671      US 0.01104603
17: -0.38885425      UK 0.17241456
18:  0.51085626      UK 0.17241456
19: -0.91381419      US 0.01104603
20:  2.31029682      US 0.01104603
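
Note that `:=` adds the group mean as a new column and keeps every original row, which is why each mean is repeated above. If a table with one row per country is what is wanted, a minimal sketch (on hypothetical toy values, not the random data from this answer) is to wrap the summary in `.()` instead:

```r
library(data.table)

# Hypothetical toy data, chosen so the group means are easy to check by hand
DT <- data.table(X = c(1, 2, 3, 4, 5, 6),
                 Country = c("US", "UK", "US", "UK", "US", "UK"))

# := adds a column by reference; DT still has six rows afterwards
DT[, Mean := mean(X), by = Country]

# .() in j returns a new table with one row per group
means <- DT[, .(Mean = mean(X)), by = Country]
print(means)
```

Here `means` has two rows, one for each country, while `DT` itself is left with the extra `Mean` column.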

Or, as suggested by hotoverflow, using aggregate:

aggregate(X ~ Country, data = DT, mean)

  Country          X
1      UK 0.17241456
2      US 0.01104603

Edit, updated in response to the comments:

library(data.table)
set.seed(100)
DT <- data.table(X = rnorm(20),  "Country" = sample(c("US","UK"), 10, TRUE), "Judge" = sample(c("James","Nick"), 10, TRUE))

aggregate(X ~ Country + Judge, data = DT, mean)

  Country Judge          X
1      UK James  0.1828624
2      US James  0.3045736
3      UK  Nick  0.1201754
4      US  Nick -0.8695368

The data.table approach:

DT[, Mean:=mean(X), by= c('Country', 'Judge')]
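
Applied to the columns from the question (`JScore`, `Country`, `Judge`), the same idea might look like the sketch below. It uses base R only; the scores are hypothetical and only the column names and the `thisjudge` variable come from the question.

```r
# Hypothetical scores; only the column names follow the question's data
x <- data.frame(
  Judge   = c("GEAR Dennis", "GEAR Dennis", "GEAR Dennis",
              "BOYS Beverley", "BOYS Beverley", "BOYS Beverley"),
  Country = c("CHN", "CHN", "RUS", "CHN", "RUS", "RUS"),
  JScore  = c(9.0, 8.0, 7.5, 8.5, 7.0, 8.0),
  stringsAsFactors = FALSE
)

thisjudge <- "GEAR Dennis"

# One judge: named vector of mean scores, one entry per diver country
one_judge <- x[x$Judge == thisjudge, ]
AvgByCountry <- tapply(one_judge$JScore, one_judge$Country, mean)
print(AvgByCountry)

# All judges at once: one row per Judge/Country combination
AvgAll <- aggregate(JScore ~ Judge + Country, data = x, FUN = mean)
print(AvgAll)
```

`tapply` gives the same kind of named result as the loop in the question, while `aggregate` returns a data frame that covers every judge in one call.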