How to calculate winning streaks in Python / Pandas

Time: 2019-06-05 00:24:50

Tags: python pandas

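I'm trying to calculate the winning or losing streak each team has going into a game. My goal is to make betting decisions based on these streaks or on recent records. I'm new to Python and Pandas (and to programming in general), so detailed explanations of the code are welcome.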

Here is my data:

    Season               Game Date                   Game Index  Away Team               Away Score  Home Team             Home Score  Winner                Loser
 0  2014 Regular Season  Saturday, March 22, 2014    2014032201  Los Angeles Dodgers              3  Arizona D'Backs                1  Los Angeles Dodgers   Arizona D'Backs
 1  2014 Regular Season  Sunday, March 23, 2014      2014032301  Los Angeles Dodgers              7  Arizona D'Backs                5  Los Angeles Dodgers   Arizona D'Backs
 2  2014 Regular Season  Sunday, March 30, 2014      2014033001  Los Angeles Dodgers              1  San Diego Padres               3  San Diego Padres      Los Angeles Dodgers
 3  2014 Regular Season  Monday, March 31, 2014      2014033101  Seattle Mariners                10  Los Angeles Angels             3  Seattle Mariners      Los Angeles Angels
 4  2014 Regular Season  Monday, March 31, 2014      2014033102  San Francisco Giants             9  Arizona D'Backs                8  San Francisco Giants  Arizona D'Backs
 5  2014 Regular Season  Monday, March 31, 2014      2014033103  Boston Red Sox                   1  Baltimore Orioles              2  Baltimore Orioles     Boston Red Sox
 6  2014 Regular Season  Monday, March 31, 2014      2014033104  Minnesota Twins                  3  Chicago White Sox              5  Chicago White Sox     Minnesota Twins
 7  2014 Regular Season  Monday, March 31, 2014      2014033105  St. Louis Cardinals              1  Cincinnati Reds                0  St. Louis Cardinals   Cincinnati Reds
 8  2014 Regular Season  Monday, March 31, 2014      2014033106  Kansas City Royals               3  Detroit Tigers                 4  Detroit Tigers        Kansas City Royals
 9  2014 Regular Season  Monday, March 31, 2014      2014033107  Colorado Rockies                 1  Miami Marlins                 10  Miami Marlins         Colorado Rockies

The same data as a dictionary (first five rows):

{'Away Score': {0: 3, 1: 7, 2: 1, 3: 10, 4: 9},
 'Away Team': {0: 'Los Angeles Dodgers',
  1: 'Los Angeles Dodgers',
  2: 'Los Angeles Dodgers',
  3: 'Seattle Mariners',
  4: 'San Francisco Giants'},
 'Game Date': {0: 'Saturday, March 22, 2014',
  1: 'Sunday, March 23, 2014',
  2: 'Sunday, March 30, 2014',
  3: 'Monday, March 31, 2014',
  4: 'Monday, March 31, 2014'},
 'Game Index': {0: 2014032201,
  1: 2014032301,
  2: 2014033001,
  3: 2014033101,
  4: 2014033102},
 'Home Score': {0: 1, 1: 5, 2: 3, 3: 3, 4: 8},
 'Home Team': {0: "Arizona D'Backs",
  1: "Arizona D'Backs",
  2: 'San Diego Padres',
  3: 'Los Angeles Angels',
  4: "Arizona D'Backs"},
 'Loser': {0: "Arizona D'Backs",
  1: "Arizona D'Backs",
  2: 'Los Angeles Dodgers',
  3: 'Los Angeles Angels',
  4: "Arizona D'Backs"},
 'Season': {0: '2014 Regular Season',
  1: '2014 Regular Season',
  2: '2014 Regular Season',
  3: '2014 Regular Season',
  4: '2014 Regular Season'},
 'Winner': {0: 'Los Angeles Dodgers',
  1: 'Los Angeles Dodgers',
  2: 'San Diego Padres',
  3: 'Seattle Mariners',
  4: 'San Francisco Giants'}}

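I tried looping through each season and each team, then building the streak counts based on this GitHub project: https://github.com/nhcamp/EPL-Betting/blob/master/EPL%20Match%20Results%20DF.ipynb

Early in building the loop I get a KeyError, and the data isn't being recognized: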

game_table = pd.read_csv('MLB_Scores_2014_2018.csv')

# Get Team List
team_list = game_table['Away Team'].unique()

# Get Season List
season_list = game_table['Season'].unique()

#Defining "chunks" to append gamedata to the total dataframe
chunks = []

for season in season_list:
    # Looping through seasons. Streaks reset for each season
    season_games = game_table[game_table['Season'] == season]

    for team in team_list:
        # Looping through teams
        season_team_games = season_games[(season_games['Away Team'] == team | season_games['Home Team'] == team)]

        #Setting streak list and streak counter values
        streak_list = []
        streak = 0

        # Looping through each game
        for game in season_team_games.iterrow():
            # Check if team is a winner, and up the streak
            if game_table['Winner'] == team:
                streak_list.append(streak)
                streak += 1
            # If not the winner, append streak and set to zero
            elif game_table['Winner'] != team:
                streak_list.append(streak)
                streak = 0
            # Just in case something weird happens with the scores
            else:
                streak_list.append(streak)
        game_table['Streak'] = streak_list
        chunk_list.append(game_table)

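That's where I get lost. How do I append the streaks separately depending on whether each team is the home or the away team? Is there a better way to present this data?

In general, I want to add a winning streak and a losing streak for each team in each game. The header should look like this: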

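| Season | Game Date | Game Index | Away Team | Away Score | Home Team | Home Score | Winner | Loser | Away Win Streak | Away Lose Streak | Home Win Streak | Home Lose Streak |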

Edit: this error message has been resolved.

I also got an error when creating the dataframe "season_team_games":

TypeError: cannot compare a dtyped [object] array with a scalar of type [bool]

1 answer:

Answer 0: (score: 0)

The error you're seeing comes from this statement:

season_team_games = season_games[(season_games['Away Team'] == team | season_games['Home Team'] == team)]

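When you combine two boolean conditions with |, each condition needs its own parentheses. This is because the | operator binds more tightly than ==, so without parentheses Python evaluates team | season_games['Home Team'] first instead of the two comparisons. The statement should become: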

season_team_games = season_games[(season_games['Away Team'] == team) | (season_games['Home Team'] == team)]
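
To make the precedence issue concrete, here is a small sketch. The names away, home and team inside the parsed string are placeholders, and the three-row frame is a cut-down copy of the question's data rather than the real CSV. The first part uses Python's ast module to show how the unparenthesized expression is grouped; the second part shows the parenthesized mask working as intended.

```python
import ast

import pandas as pd

# How Python parses the unparenthesized condition (placeholder names only).
tree = ast.parse("away == team | home == team", mode="eval").body
print(type(tree).__name__)                 # Compare: one chained comparison
print(type(tree.comparators[0]).__name__)  # BinOp: `team | home` is grouped first

# With parentheses, each comparison produces a boolean Series,
# and only then does `|` combine them element-wise.
games = pd.DataFrame({
    "Away Team": ["Los Angeles Dodgers", "Seattle Mariners", "San Francisco Giants"],
    "Home Team": ["Arizona D'Backs", "Los Angeles Angels", "Arizona D'Backs"],
})
team = "Arizona D'Backs"
mask = (games["Away Team"] == team) | (games["Home Team"] == team)
team_games = games[mask]
print(team_games.index.tolist())           # [0, 2]
```

In other words, the original line is read as away == (team | home) == team, which is why pandas ends up trying to apply | between a string and a whole column.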

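I know the question is about more than just this error. As mentioned in the comments, it will probably be easier to get help once you provide some text-based data.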
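
For the broader goal (streaks going into each game, for both the home and the away team), here is a minimal sketch rather than a definitive implementation. It swaps the nested loops for a groupby, which also sidesteps a few other problems in the loop: the method is iterrows() rather than iterrow(), game_table['Winner'] == team compares the whole column instead of the current row, and the list is defined as chunks but appended to as chunk_list. The sketch assumes that Game Index is unique and sorts games chronologically, that streaks reset each season, and that there are no ties. The helper names streak_entering and add_streaks and the output column names are my own, and the sample frame only copies the needed columns from the first five rows in the question.

```python
import pandas as pd


def streak_entering(flags: pd.Series) -> pd.Series:
    """Length of the run of True values immediately before each row."""
    run_id = (~flags).cumsum()  # the id changes at every False, which breaks a run
    run_length = flags.astype(int).groupby(run_id).cumsum()
    # Shift by one game so the value describes the streak *before* this game.
    return run_length.shift(1, fill_value=0)


def add_streaks(games: pd.DataFrame) -> pd.DataFrame:
    """Return a sorted copy of `games` with four streak columns added."""
    games = games.sort_values("Game Index").reset_index(drop=True)

    # Long format: one row per (game, team), tagged with the side played.
    parts = []
    for side in ("Away", "Home"):
        part = games[["Season", "Game Index", "Winner"]].copy()
        part["Team"] = games[f"{side} Team"]
        part["Side"] = side
        parts.append(part)
    long = pd.concat(parts, ignore_index=True)
    long = long.sort_values("Game Index", kind="stable")

    long["Won"] = long["Team"] == long["Winner"]
    long["Lost"] = ~long["Won"]  # assumption: no ties
    by_team = long.groupby(["Season", "Team"])  # streaks reset per season
    long["Win Streak"] = by_team["Won"].transform(streak_entering)
    long["Lose Streak"] = by_team["Lost"].transform(streak_entering)

    # Back to one row per game: look up each side's streaks by Game Index.
    for side in ("Away", "Home"):
        side_rows = long[long["Side"] == side].set_index("Game Index")
        for col in ("Win Streak", "Lose Streak"):
            games[f"{side} {col}"] = games["Game Index"].map(side_rows[col])
    return games


# Sample built from the first five rows of the question's data.
sample = pd.DataFrame({
    "Season": ["2014 Regular Season"] * 5,
    "Game Index": [2014032201, 2014032301, 2014033001, 2014033101, 2014033102],
    "Away Team": ["Los Angeles Dodgers", "Los Angeles Dodgers", "Los Angeles Dodgers",
                  "Seattle Mariners", "San Francisco Giants"],
    "Home Team": ["Arizona D'Backs", "Arizona D'Backs", "San Diego Padres",
                  "Los Angeles Angels", "Arizona D'Backs"],
    "Winner": ["Los Angeles Dodgers", "Los Angeles Dodgers", "San Diego Padres",
               "Seattle Mariners", "San Francisco Giants"],
})
result = add_streaks(sample)
print(result[["Game Index", "Away Win Streak", "Away Lose Streak",
              "Home Win Streak", "Home Lose Streak"]])
```

On this sample, the Dodgers enter their third game on a two-game winning streak, and Arizona enters the March 31 game having lost two in a row. Working in a long format first means each team has a single timeline regardless of whether it was home or away, which is what makes the home/away split at the end a simple lookup.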