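Summarising in R

Time: 2018-12-17 22:07:25

Tags: r average frame summary

I am trying to create a data frame in R that summarises the betting odds for football teams, based on another data frame.

For example, this very small sample table contains the home and away team for each match, along with the respective match odds.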

matchData:

Key: HWO (home win odds), DO (draw odds), AWO (away win odds)

+----------+----------+------+------+------+
| HomeTeam | AwayTeam | HWO  |  DO  | AWO  |
+----------+----------+------+------+------+
| TeamA    | TeamB    | 1.30 | 5.20 | 9.50 |
| TeamC    | TeamD    | 1.59 | 4.20 | 6.30 |
| TeamE    | TeamF    | 3.00 | 5.50 | 1.70 |
| TeamB    | TeamA    | 1.50 | 4.50 | 8.70 |
| TeamD    | TeamC    | 1.25 | 4.20 | 8.00 |
| TeamF    | TeamE    | 1.40 | 5.00 | 7.20 |
+----------+----------+------+------+------+

Here is the dput() output for this data frame:

structure(list(HomeTeam = c("TeamA", "TeamC", "TeamE", "TeamB", 
"TeamD", "TeamF"), AwayTeam = c("TeamB", "TeamD", "TeamF", "TeamA", 
"TeamC", "TeamE"), HWO = c(1.3, 1.59, 3, 1.5, 1.25, 1.4), DO = c(5.2, 
4.2, 5.5, 4.5, 4.2, 5), AWO = c(9.5, 6.3, 1.7, 8.7, 8, 7.2)), row.names = c(NA, 
-6L), class = c("tbl_df", "tbl", "data.frame"))

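The other data frame I need is one that averages the odds for each team. It has to take into account whether the team is playing at home or away, and use the appropriate figure for each match.

Below is how the final table should look: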

oddsSummary:

Key: AvgWO (average win odds), AvgDO (average draw odds), AvgLO (average lose odds)

+-------+-------+-------+-------+
| Team  | AvgWO | AvgDO | AvgLO |
+-------+-------+-------+-------+
| TeamA | 5.00  | 4.85  | 5.50  |
| TeamB | 5.50  | 4.85  | 5.00  |
| TeamC | 4.80  | 4.20  | 3.78  |
| TeamD | 3.78  | 4.20  | 4.80  |
| TeamE | 5.10  | 5.25  | 1.55  |
| TeamF | 1.55  | 5.25  | 5.10  |
+-------+-------+-------+-------+

Here is the dput() output for this data frame:

structure(list(Team = c("TeamA", "TeamB", "TeamC", "TeamD", "TeamE", 
"TeamF"), AvgWO = c(5, 5.5, 4.8, 3.78, 5.1, 1.55), AvgDO = c(4.85, 
4.85, 4.2, 4.2, 5.25, 5.25), AvgLO = c(5.5, 5, 7.55, 4.8, 2, 
5.1)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"
))

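For example, taking TeamA:

  • Add up TeamA's win odds. If they are playing at home, add the HWO figure; if they are playing away, add the AWO figure. Then divide by the total number of matches they played.

    • e.g. (1.30 + 8.70) ÷ 2 = 5.00
  • Add up TeamA's draw odds and divide by the total number of matches they played.

    • e.g. (5.20 + 4.50) ÷ 2 = 4.85
  • Add up TeamA's lose odds. If they are playing at home, add the AWO figure; if they are playing away, add the HWO figure. Then divide by the total number of matches they played.

    • e.g. (9.50 + 1.50) ÷ 2 = 5.50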
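
The three steps above can be checked by hand for TeamA with a few lines of base R. This is only a sketch against the sample data; the names `team`, `home`, `away`, `avg_wo`, `avg_do` and `avg_lo` are illustrative and not part of any required solution.

```r
# Sample data from the question (matchData), rebuilt as a plain data.frame.
matchData <- data.frame(
  HomeTeam = c("TeamA", "TeamC", "TeamE", "TeamB", "TeamD", "TeamF"),
  AwayTeam = c("TeamB", "TeamD", "TeamF", "TeamA", "TeamC", "TeamE"),
  HWO = c(1.3, 1.59, 3, 1.5, 1.25, 1.4),
  DO  = c(5.2, 4.2, 5.5, 4.5, 4.2, 5),
  AWO = c(9.5, 6.3, 1.7, 8.7, 8, 7.2),
  stringsAsFactors = FALSE
)

team <- "TeamA"
home <- matchData$HomeTeam == team  # rows where the team plays at home
away <- matchData$AwayTeam == team  # rows where the team plays away

# Win odds: HWO when at home, AWO when away.
avg_wo <- mean(c(matchData$HWO[home], matchData$AWO[away]))
# Draw odds: the same column regardless of venue.
avg_do <- mean(c(matchData$DO[home], matchData$DO[away]))
# Lose odds: AWO when at home, HWO when away.
avg_lo <- mean(c(matchData$AWO[home], matchData$HWO[away]))

print(c(AvgWO = avg_wo, AvgDO = avg_do, AvgLO = avg_lo))
```

Because `mean()` divides by the number of values it is given, the same lines would still apply if a team had played a different number of home and away matches.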

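I would be grateful to anyone who has a solution for this. Please make sure it is a robust solution that can cope with different numbers of matches, etc.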

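1 Answer:

Answer 0: (score: 0)

Interesting question. Here is a solution. It gives a different answer for the avgLO of TeamC and TeamE, but going by your description I believe my solution below is correct. So please double-check and let me know.

You could improve the naming and so on, but I hope it helps.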

DF <-
  data.frame(
    HomeTeam = paste0("Team", c("A", "C", "E", "B", "D", "F")),
    AwayTeam = paste0("Team", c("B", "D", "F", "A", "C", "E")),
    HWO = c(1.3, 1.59, 3, 1.5, 1.25, 1.4),
    DO = c(5.2, 4.2, 5.5, 4.5, 4.2, 5),
    AWO = c(9.5, 6.3, 1.7, 8.7, 8, 7.2)
  )

library(magrittr)
library(dplyr)
library(reshape2)

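# Stack HomeTeam and AwayTeam into a single Team column, so each match
# yields one row per team; `variable` records which column the team came from.
DF %>%
  melt(id.vars = c("HWO", "DO", "AWO"), value.name = "Team") %>%
  # Win odds are HWO for the home side and AWO for the away side;
  # lose odds are the reverse.
  mutate(WO = ifelse(variable == "HomeTeam", HWO, AWO),
         LO = ifelse(variable == "HomeTeam", AWO, HWO)) %>%
  group_by(Team) %>%
  # mean() divides by the number of matches each team actually played.
  summarise(avgWO = mean(WO),
            avgDO = mean(DO),
            avgLO = mean(LO))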

Result (convert it to a data.frame if a tibble does not suit you)

# A tibble: 6 x 4
  Team  avgWO avgDO avgLO
  <chr> <dbl> <dbl> <dbl>
1 TeamA  5     4.85  5.5 
2 TeamB  5.5   4.85  5   
3 TeamC  4.80  4.2   3.78
4 TeamD  3.78  4.2   4.80
5 TeamE  5.1   5.25  1.55
6 TeamF  1.55  5.25  5.1
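
For readers who would rather avoid the reshape2 dependency, the same idea can be sketched in base R. This is not the method used above, just one possible alternative under the same assumptions about the column names; the function name `odds_summary` and the variables `res_full` and `res_short` are hypothetical. The second call drops the last match to illustrate that teams with unequal numbers of matches are still averaged over their own match count.

```r
# Same sample data as in the answer.
DF <- data.frame(
  HomeTeam = paste0("Team", c("A", "C", "E", "B", "D", "F")),
  AwayTeam = paste0("Team", c("B", "D", "F", "A", "C", "E")),
  HWO = c(1.3, 1.59, 3, 1.5, 1.25, 1.4),
  DO  = c(5.2, 4.2, 5.5, 4.5, 4.2, 5),
  AWO = c(9.5, 6.3, 1.7, 8.7, 8, 7.2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: average win/draw/lose odds per team.
odds_summary <- function(matches) {
  # One row per team per match, seen from that team's side.
  home <- data.frame(Team = matches$HomeTeam, WO = matches$HWO,
                     DO = matches$DO, LO = matches$AWO,
                     stringsAsFactors = FALSE)
  away <- data.frame(Team = matches$AwayTeam, WO = matches$AWO,
                     DO = matches$DO, LO = matches$HWO,
                     stringsAsFactors = FALSE)
  long <- rbind(home, away)
  # Group by Team; mean() uses however many rows each team has.
  # Note: the formula interface drops rows containing NA by default.
  res <- aggregate(cbind(WO, DO, LO) ~ Team, data = long, FUN = mean)
  names(res) <- c("Team", "avgWO", "avgDO", "avgLO")
  res
}

res_full  <- odds_summary(DF)         # all six matches
res_short <- odds_summary(DF[-6, ])   # TeamE and TeamF now have one match each
print(res_full)
print(res_short)
```

Building the home and away views separately keeps the "which odds count as a win" rule explicit in the column assignments rather than in an `ifelse()`.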