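I am trying to use a list in R as a dictionary to compute the win percentage of basketball teams. Basically, for each win I want to increment that team's entry in a wins dictionary, and for each game played I want to increment that team's entry in a totals dictionary. Somehow the answers I get look plausible but are incorrect, and I can't figure out why the program's logic does not give the expected output. Any suggestions or tips would be appreciated. The code I am using is below: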
games <- read.csv(game_pathname, header = FALSE)
names(games) <- c("GameDate", "DateCount", "HomeID", "AwayID", "HomePts", "AwayPts", "HomeAbbr", "AwayAbbre", "HomeName", "AwayName")
wins = list()
total = list()
for (team in unique(games$HomeName)) {
  wins[team] <- 0
  total[team] <- 0
}
for (i in 1:nrow(games)) {
  if (games$HomePts[i] > games$AwayPts[i]) {
    wins[games$HomeName[i]] <- wins[[games$HomeName[i]]] + 1
  } else {
    wins[games$AwayName[i]] <- wins[[games$AwayName[i]]] + 1
  }
  total[games$HomeName[i]] <- total[[games$HomeName[i]]] + 1
  total[games$AwayName[i]] <- total[[games$AwayName[i]]] + 1
}
for (team in unique(games$HomeName)) {
  print(paste(team, wins[[team]] / total[[team]]))
}
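Answer 0 (score: 0)
I looked through the code and ran it on a toy example, and found no problem with the algorithm. In the simulation below I use three teams: one loses every game, one breaks even, and the third is the champion and wins every game.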
games <- data.frame(HomeName = c("a", "b", "c"),
                    HomePts = c(1, 2, 3),
                    AwayPts = c(3, 1, 2),
                    AwayName = c("c", "a", "b"))
wins = list()
total = list()
for (team in unique(games$HomeName)) {
  wins[team] <- 0
  total[team] <- 0
}
for (i in 1:nrow(games)) {
  if (games$HomePts[i] > games$AwayPts[i]) {
    wins[games$HomeName[i]] <- wins[[games$HomeName[i]]] + 1
  } else {
    wins[games$AwayName[i]] <- wins[[games$AwayName[i]]] + 1
  }
  total[games$HomeName[i]] <- total[[games$HomeName[i]]] + 1
  total[games$AwayName[i]] <- total[[games$AwayName[i]]] + 1
}
for (team in unique(games$HomeName)) {
  print(paste(team, wins[[team]] / total[[team]]))
}
games
wins
total
The output of your algorithm is as follows:
[1] "a 0"
[1] "b 0.5"
[1] "c 1"
> games
  HomeName HomePts AwayPts AwayName
1        a       1       3        c
2        b       2       1        a
3        c       3       2        b
> wins
$`a`
[1] 0
$b
[1] 1
$c
[1] 2
> total
$`a`
[1] 2
$b
[1] 2
$c
[1] 2
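One caveat about the toy example, offered as an assumption about the real data rather than something the question confirms: if the team-name columns are factors (the `<fct>` column type in the dplyr output further down suggests they are in this session), indexing a list with a factor uses the factor's integer codes, not its labels. The loop then only reaches the right element when the list happens to be stored in level order, as it is for a, b, c here. A minimal sketch with hypothetical values shows the difference:

```r
# Hypothetical list whose element order differs from alphabetical order.
wins <- list(c = 10, a = 20, b = 30)

# A factor stores integer codes; "c" is level 3 here.
team <- factor("c", levels = c("a", "b", "c"))

by_code <- wins[[team]]                # uses code 3, i.e. the element named "b"
by_name <- wins[[as.character(team)]]  # uses the label, i.e. the element named "c"

print(c(by_code = by_code, by_name = by_name))
```

If this is what happens with the real file, wrapping the names in as.character() inside the loop, or reading the file with stringsAsFactors = FALSE, would make the lookups go by name.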
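However, using for loops and manipulating list elements directly by index is not "R style", not "comme il faut" :)
For example, you can get a similar result with the dplyr package, which is part of the tidyverse. The code below compares the scores of each game, splits the result into two data frames (one for home teams, one for away teams), and binds them together row-wise. Finally, it groups by team name and computes the mean win rate. Please see below: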
library(dplyr)
df <- games %>% mutate(hwins = (HomePts > AwayPts), awins = !hwins)
df_home <- df %>% select(HomeName, hwins) %>% rename(name = HomeName, wins = hwins)
df_away <- df %>% select(AwayName, awins) %>% rename(name = AwayName, wins = awins)
df <- bind_rows(df_home, df_away) %>% group_by(name) %>% summarise(mean_wins = mean(wins))
df
Output:
# A tibble: 3 x 2
  name  mean_wins
  <fct>     <dbl>
1 a           0
2 b           0.5
3 c           1
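For comparison, the same stack-then-average idea can be sketched in base R without dplyr. This is only an illustration on the toy data above; the variable names home_win, name, won and win_rate are made up for this example.

```r
# Toy data from the example above.
games <- data.frame(HomeName = c("a", "b", "c"),
                    HomePts = c(1, 2, 3),
                    AwayPts = c(3, 1, 2),
                    AwayName = c("c", "a", "b"))

# TRUE where the home team won the game.
home_win <- games$HomePts > games$AwayPts

# Stack home and away rows so that each team has one entry per game played.
# as.character() guards against the name columns being factors.
name <- c(as.character(games$HomeName), as.character(games$AwayName))
won <- c(home_win, !home_win)

# Mean of a logical vector is the share of TRUE values, i.e. the win rate.
win_rate <- tapply(won, name, mean)
print(win_rate)
```

Because the away team's result is simply the negation of the home team's, no explicit loop or counter is needed.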