Calculating a row with a lambda in pandas

Time: 2021-05-27 07:45:46

Tags: python pandas dataframe

I have this dataframe:

    Unnamed: 0  Datetime             HomeTeam      AwayTeam         Ball PossessionMatch_H  Ball PossessionMatch_A
0   0           2021-05-24 02:30:00  U. De Chile   Everton          68                      32
1   1           2021-05-23 21:00:00  Huachipato    Colo Colo        48                      52
2   2           2021-05-23 18:30:00  Melipilla     Antofagasta      47                      53
3   3           2021-05-23 02:30:00  U. Espanola   U. Catolica      37                      63
4   4           2021-05-23 00:00:00  S. Wanderers  O'Higgins        29                      71
... ...         ...                  ...           ...              ...                     ...
57  57          2021-03-28 15:45:00  Palestino     Antofagasta      58                      42
58  58          2021-03-28 01:00:00  U. Espanola   S. Wanderers     50                      50
59  59          2021-03-27 22:30:00  Colo Colo     Union La Calera  58                      42
60  60          2021-03-27 20:00:00  Everton       O'Higgins        54                      46
61  61          2021-03-27 15:00:00  Curico Unido  Melipilla        41                      59

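I want to split it into multiple dataframes, apply two criteria on "HomeTeam" and "AwayTeam", then compute the mean Ball Possession for each team and put it into new columns: "Ball PossessionMatch_H/MP" (for the team in "HomeTeam") and "Ball PossessionMatch_A/MP" (for the team in "AwayTeam").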

Code:

hometeam_count = df.groupby("HomeTeam")["Ball PossessionMatch_H"].count()
hometeam_sum = df.groupby("HomeTeam")["Ball PossessionMatch_H"].sum()
awayteam_count = df.groupby("AwayTeam")["Ball PossessionMatch_A"].count()
awayteam_sum = df.groupby("AwayTeam")["Ball PossessionMatch_A"].sum()

df["Ball PossessionMatch_H/MP"] = df["HomeTeam"].apply(lambda x: ((hometeam_sum.loc[x] if x in hometeam_sum.index else 0) + (awayteam_sum.loc[x] if x in awayteam_sum.index else 0)) / ((hometeam_count.loc[x] if x in hometeam_count.index else 0)  + (awayteam_count.loc[x] if x in awayteam_count.index else 0)))
df["Ball PossessionMatch_A/MP"] = df["AwayTeam"].apply(lambda x: ((hometeam_sum.loc[x] if x in hometeam_sum.index else 0) + (awayteam_sum.loc[x] if x in awayteam_sum.index else 0)) / ((hometeam_count.loc[x] if x in hometeam_count.index else 0)  + (awayteam_count.loc[x] if x in awayteam_count.index else 0)))
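
For comparison, the same per-team average (home and away matches combined) can be sketched without the long lambdas. This is only an illustration, not the asker's code: it stacks the home and away possession values into one long table, takes the mean per team, and maps it back. The names `home`, `away`, `team_mean`, `Team` and `Possession` are made up for this example, and the three-row dataframe is a cut-down version of the sample data.

```python
import pandas as pd

# Small sample in the same shape as the question's dataframe.
df = pd.DataFrame({
    "HomeTeam": ["U. De Chile", "Huachipato", "Everton"],
    "AwayTeam": ["Everton", "Colo Colo", "O'Higgins"],
    "Ball PossessionMatch_H": [68, 48, 54],
    "Ball PossessionMatch_A": [32, 52, 46],
})

# Put home and away appearances under common (hypothetical) column names.
home = df[["HomeTeam", "Ball PossessionMatch_H"]].rename(
    columns={"HomeTeam": "Team", "Ball PossessionMatch_H": "Possession"}
)
away = df[["AwayTeam", "Ball PossessionMatch_A"]].rename(
    columns={"AwayTeam": "Team", "Ball PossessionMatch_A": "Possession"}
)

# One mean per team over all of its matches, home or away.
team_mean = pd.concat([home, away]).groupby("Team")["Possession"].mean()

# Look the mean up for the home side and the away side of every row.
df["Ball PossessionMatch_H/MP"] = df["HomeTeam"].map(team_mean)
df["Ball PossessionMatch_A/MP"] = df["AwayTeam"].map(team_mean)
```

Because `mean()` skips missing values in the same way `sum()` and `count()` do, this should give the same numbers as dividing the combined sums by the combined counts.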

Solution:

hometeam_count_ = df.groupby("HomeTeam").apply(lambda x: x.iloc[1:, :]["Ball PossessionMatch_H"].count())
hometeam_sum_ = df.groupby("HomeTeam").apply(lambda x: x.iloc[1:, :]["Ball PossessionMatch_H"].sum())
awayteam_count_ = df.groupby("AwayTeam").apply(lambda x: x.iloc[:, :]["Ball PossessionMatch_A"].count())
awayteam_sum_ = df.groupby("AwayTeam").apply(lambda x: x.iloc[:, :]["Ball PossessionMatch_A"].sum())

hometeam_count__ = df.groupby("HomeTeam").apply(lambda x: x.iloc[:, :]["Ball PossessionMatch_H"].count())
hometeam_sum__ = df.groupby("HomeTeam").apply(lambda x: x.iloc[:, :]["Ball PossessionMatch_H"].sum())
awayteam_count__ = df.groupby("AwayTeam").apply(lambda x: x.iloc[1:, :]["Ball PossessionMatch_A"].count())
awayteam_sum__ = df.groupby("AwayTeam").apply(lambda x: x.iloc[1:, :]["Ball PossessionMatch_A"].sum())


df["Ball PossessionMatch_H/MP"] = df["HomeTeam"].apply(lambda x: ((hometeam_sum_.loc[x] if x in hometeam_sum_.index else 0) + (awayteam_sum_.loc[x] if x in awayteam_sum_.index else 0)) / ((hometeam_count_.loc[x] if x in hometeam_count_.index else 0)  + (awayteam_count_.loc[x] if x in awayteam_count_.index else 0)))
df["Ball PossessionMatch_A/MP"] = df["AwayTeam"].apply(lambda x: ((hometeam_sum__.loc[x] if x in hometeam_sum__.index else 0) + (awayteam_sum__.loc[x] if x in awayteam_sum__.index else 0)) / ((hometeam_count__.loc[x] if x in hometeam_count__.index else 0)  + (awayteam_count__.loc[x] if x in awayteam_count__.index else 0)))


1 answer:

Answer 0 (score: 1)

Maybe this is what you are looking for?

hometeam_count = df.groupby("HomeTeam").apply(
    lambda x: x.iloc[1:, :]["Ball PossessionMatch_H"].count()
)
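
To see what `iloc[1:]` does inside a grouped `apply`, here is a small sketch on made-up rows (the dataframe below is hypothetical, not the asker's data). Within each group, `iloc[1:]` drops the first row in the dataframe's current order, so the count and sum cover every match of that team except the first one listed. The example selects the column before calling `apply`, which is a slight variation on the snippet above.

```python
import pandas as pd

# Hypothetical data: Everton has three home matches, Palestino only one.
df = pd.DataFrame({
    "HomeTeam": ["Everton", "Everton", "Everton", "Palestino"],
    "Ball PossessionMatch_H": [54, 60, 48, 58],
})

grouped = df.groupby("HomeTeam")["Ball PossessionMatch_H"]

# Count over all rows of each group.
count_all = grouped.count()

# Count and sum over each group with its first row skipped.
count_skip_first = grouped.apply(lambda s: s.iloc[1:].count())
sum_skip_first = grouped.apply(lambda s: s.iloc[1:].sum())

print(count_all)
print(count_skip_first)
print(sum_skip_first)
```

A team with a single match ends up with a count of 0 after the skip, so a later division of sum by count can produce NaN for that team unless the other side (home or away) contributes at least one match.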