How do I look up multiple entries in a table to calculate a proportion?

Asked: 2018-03-05 21:03:48

Tags: python arrays python-3.x

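I have a table of many sports teams. For each team, I need to know the number of fans attending the game as a percentage of all attending fans in the same region whose team has the same name suffix. The table below gives an idea of what I'm working with: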

Region      Team                    Suffix   Attending Fans
North West  blue city               city    181
North East  Black and white united  united  130
North West  blue and white city     city    101
North East  Purple United           united  12
North East  red city                city    73
North East  red and white                   112
North West  Red city                city    162
North East  white shorts united     united  93
North East  orange and black city   city    68
North West  pink united             united  4
North West  red united              united  192
North West  orange united           united  42

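In the example above, the fans attending for Red city make up about 36.49% (162 of 444) of all attending fans of North West teams with the suffix "city".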
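To make the arithmetic behind that figure concrete, here is a minimal sketch in plain Python. The dictionary name `north_west_city` is made up for illustration; the numbers are copied from the three North West rows with the suffix "city" in the table above.

```python
# Attendance of the three North West teams whose suffix is "city"
# (values copied from the table; the variable name is illustrative only).
north_west_city = {
    "blue city": 181,
    "blue and white city": 101,
    "Red city": 162,
}

# Denominator: every attending fan in the same (Region, Suffix) group.
total = sum(north_west_city.values())

# Numerator: the fans of the one team we are interested in.
share = 100 * north_west_city["Red city"] / total

print(total)             # 444
print(f"{share:.2f}%")   # 36.49%
```

The exact value is 36.486...%, so it reads as 36.48% when truncated and 36.49% when rounded to two decimals.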

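What I'd like to know is:

  1. How do I look up the relevant entries so that I can perform the calculation?
  2. How do I automate this so that it is done for every team, including those without a suffix?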

1 Answer:

Answer 0 (score: 0)

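Here is one way. The idea is to compute a groupby.sum() over Region and Suffix, then map those group totals back onto the dataframe as part of the calculation.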

import numpy as np
import pandas as pd

df = pd.DataFrame([['North West', 'blue city', 'city', 181],
                   ['North East', 'Black and white united', 'united', 130],
                   ['North West', 'blue and white city', 'city', 101],
                   ['North East', 'Purple United', 'united', 12],
                   ['North East', 'red city', 'city', 73],
                   ['North East', 'red and white', '', 112],
                   ['North West', 'Red city', 'city', 162],
                   ['North East', 'white shorts united', 'united', 93],
                   ['North East', 'orange and black city', 'city', 68],
                   ['North West', 'pink united', 'united', 4],
                   ['North West', 'red united', 'united', 192],
                   ['North West', 'orange united', 'united', 42]],
                  columns=['Region', 'Team', 'Suffix', 'Attending Fans'])

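# Total attendance per (Region, Suffix) group; the result is a Series
# indexed by (Region, Suffix) pairs. The empty suffix '' forms its own group.
g = df.groupby(['Region', 'Suffix'])['Attending Fans'].sum()

# Look up each row's group total by its (Region, Suffix) key, then divide.
df['Pct'] = 100 * df['Attending Fans'] / np.fromiter(map(g.get,
            map(tuple, df[['Region', 'Suffix']].values)), dtype=float)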

#         Region                    Team  Suffix  Attending Fans         Pct
# 0   North West               blue city    city             181   40.765766
# 1   North East  Black and white united  united             130   55.319149
# 2   North West     blue and white city    city             101   22.747748
# 3   North East           Purple United  united              12    5.106383
# 4   North East                red city    city              73   51.773050
# 5   North East           red and white                     112  100.000000
# 6   North West                Red city    city             162   36.486486
# 7   North East     white shorts united  united              93   39.574468
# 8   North East   orange and black city    city              68   48.226950
# 9   North West             pink united  united               4    1.680672
# 10  North West              red united  united             192   80.672269
# 11  North West           orange united  united              42   17.647059
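
As an alternative to the answer's lookup with `g.get`, the same percentages can be sketched with `groupby(...).transform('sum')`, which returns the group total already aligned to each original row. This is a different technique from the one in the answer, shown only for comparison; the name `group_total` is introduced here for illustration, and the data is the same sample table.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame(
    [
        ["North West", "blue city", "city", 181],
        ["North East", "Black and white united", "united", 130],
        ["North West", "blue and white city", "city", 101],
        ["North East", "Purple United", "united", 12],
        ["North East", "red city", "city", 73],
        ["North East", "red and white", "", 112],
        ["North West", "Red city", "city", 162],
        ["North East", "white shorts united", "united", 93],
        ["North East", "orange and black city", "city", 68],
        ["North West", "pink united", "united", 4],
        ["North West", "red united", "united", 192],
        ["North West", "orange united", "united", 42],
    ],
    columns=["Region", "Team", "Suffix", "Attending Fans"],
)

# transform("sum") broadcasts each (Region, Suffix) total back onto the
# rows of that group, so no explicit key lookup is needed.
group_total = df.groupby(["Region", "Suffix"])["Attending Fans"].transform("sum")

# Each team's share of its group's attendance, in percent.
df["Pct"] = 100 * df["Attending Fans"] / group_total

print(df)
```

This relies on the missing suffix being stored as an empty string, as in the sample data, so that "red and white" forms its own group and gets 100%. If the missing suffix were stored as NaN instead, `groupby` drops those keys by default, and passing `dropna=False` (available in pandas 1.1 and later) would probably be needed to keep such teams in the result.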