R: How do I use lists?

Date: 2012-02-24 11:13:01

Tags: r

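I am trying to use lists in R as dictionaries to compute the win percentages of basketball teams. Basically, for each win I want to increment the winning team's entry in a wins dictionary, and for each game I want to increment both teams' entries in a games-played dictionary. Somehow the answers I get look plausible but are incorrect, and I cannot work out why the program does not produce the expected output. Any suggestions or hints would be appreciated. The code I am using is below: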

games <- read.csv(game_pathname, header = FALSE)

names(games) <- c("GameDate", "DateCount", "HomeID", "AwayID", "HomePts", "AwayPts",
                  "HomeAbbr", "AwayAbbre", "HomeName", "AwayName")

wins = list()
total = list()

for (team in unique(games$HomeName)) {
    wins[team] <- 0
    total[team] <- 0
}

for (i in 1:nrow(games)) {
    if (games$HomePts[i] > games$AwayPts[i]) {
        wins[games$HomeName[i]] <- wins[[games$HomeName[i]]] + 1
    } else {
        wins[games$AwayName[i]] <- wins[[games$AwayName[i]]] + 1
    }
    total[games$HomeName[i]] <- total[[games$HomeName[i]]] + 1
    total[games$AwayName[i]] <- total[[games$AwayName[i]]] + 1
}

for (team in unique(games$HomeName)) {
    print(paste(team, wins[[team]] / total[[team]]))
}

1 answer:

Answer 0 (score: 0)

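I looked at the code and ran it on a toy example, and the algorithm itself showed no problem. In the simulation below I use three teams: one loses every game, one breaks even, and the third wins every game.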

games <- data.frame(HomeName = c("a", "b", "c"),
                    HomePts = c(1, 2, 3),
                    AwayPts = c(3, 1, 2),
                    AwayName = c("c", "a", "b"))
wins = list()
total = list()

for (team in unique(games$HomeName)) {
  wins[team] <- 0
  total[team] <- 0
}

for (i in 1:nrow(games)) {
  if (games$HomePts[i] > games$AwayPts[i]) {
    wins[games$HomeName[i]] <- wins[[games$HomeName[i]]] + 1
  } else {
    wins[games$AwayName[i]] <- wins[[games$AwayName[i]]] + 1
  }
  total[games$HomeName[i]] <- total[[games$HomeName[i]]] + 1
  total[games$AwayName[i]] <- total[[games$AwayName[i]]] + 1
}

for (team in unique(games$HomeName)) {
  print(paste(team, wins[[team]] / total[[team]]))
}

games
wins
total

The output of your algorithm is as follows:

[1] "a 0"
[1] "b 0.5"
[1] "c 1"

> games
  HomeName HomePts AwayPts AwayName
1        a       1       3        c
2        b       2       1        a
3        c       3       2        b

> wins
$`a`
[1] 0

$b
[1] 1

$c
[1] 2

> total
$`a`
[1] 2

$b
[1] 2

$c
[1] 2
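
One thing the toy example may hide, offered as a guess rather than a diagnosis: if the name columns are factors (the default for `read.csv` and `data.frame` before R 4.0.0), then indexing a list with a factor uses the factor's integer codes, not its labels. In the toy data the codes happen to line up with the list positions. With real data they need not. A minimal sketch with made-up values, where `wins_by_factor`, `wins_by_name`, `winner` and `key` are illustrative names that do not appear in the question:

```r
# Made-up example: the list is built in order of first appearance (c, a, b),
# while the factor levels are sorted alphabetically (a, b, c).
wins_by_factor <- list(c = 0, a = 0, b = 0)
wins_by_name   <- list(c = 0, a = 0, b = 0)

winner <- factor("c", levels = c("a", "b", "c"))  # integer code of "c" is 3

# Indexing with the factor uses code 3, so the third element ("b") is updated.
wins_by_factor[winner] <- wins_by_factor[[winner]] + 1

# Converting to character first looks the element up by name instead.
key <- as.character(winner)
wins_by_name[key] <- wins_by_name[[key]] + 1

print(unlist(wins_by_factor))
print(unlist(wins_by_name))
```

If this is what is happening in the real data, wrapping each lookup in `as.character()`, or reading the file with `stringsAsFactors = FALSE`, would be one way to avoid it.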

However, looping with for and manipulating list elements directly by index is not really "R style", not quite "comme il faut" :)

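For example, you can get the same result with the dplyr package, which is part of the tidyverse. The code below compares the scores of each game, splits the result into two data frames (one for home teams, one for away teams) and binds them row-wise. Finally, it groups by team name and computes the mean of the win indicator, which is the win rate. See below: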

library(dplyr)
df <- games %>% mutate(hwins = (HomePts > AwayPts), awins = !hwins)
df_home <- df %>% select(HomeName, hwins) %>% rename(name = HomeName, wins = hwins)
df_away <- df %>% select(AwayName, awins) %>% rename(name = AwayName, wins = awins)
df <- bind_rows(df_home, df_away) %>% group_by(name) %>% summarise(mean_wins = mean(wins))
df

Output:

# A tibble: 3 x 2
  name  mean_wins
  <fct>     <dbl>
1 a           0  
2 b           0.5
3 c           1
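
The same grouping idea can also be sketched in base R without any packages. This is only an illustration on the toy data above, not a replacement for the dplyr version. The names `home_won`, `team`, `won` and `win_rate` are introduced here for the example:

```r
# Toy data from the answer above
games <- data.frame(HomeName = c("a", "b", "c"),
                    HomePts  = c(1, 2, 3),
                    AwayPts  = c(3, 1, 2),
                    AwayName = c("c", "a", "b"))

# TRUE where the home team won the game
home_won <- games$HomePts > games$AwayPts

# One entry per team per game: home teams first, then away teams.
# as.character() keeps this safe if the name columns are factors.
team <- c(as.character(games$HomeName), as.character(games$AwayName))
won  <- c(home_won, !home_won)

# Mean of the win indicator per team is the win rate
win_rate <- tapply(won, team, mean)
print(win_rate)
```

Here `tapply` does the grouping and averaging in one vectorised call, so no explicit loop or list bookkeeping is needed.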