I have a pandas DataFrame that looks like this:
team W L GF GA date home_ind last10
67 ARI 1 0 3 2 2016-11-01 1 1
99 ARI 1 0 2 2 2016-11-03 1 1
129 ARI 1 0 4 3 2016-10-15 1 1
171 ARI 1 0 5 4 2016-10-27 0 1
241 ARI 0 10 1 5 2016-11-04 0 0
316 ARI 0 10 3 5 2016-10-25 0 1
331 ARI 0 10 2 3 2016-10-21 0 1
334 ARI 0 10 2 3 2016-10-29 1 1
335 ARI 0 10 2 5 2016-10-20 0 1
340 ARI 0 10 4 7 2016-10-18 0 1
341 ARI 0 10 2 3 2016-10-23 0 1
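I have this information for 30 different teams.
What I want to do is sum the values in one column based on conditions on other columns.
For example, I want a new column that sums the GF values, but only for rows where home_ind = 1 and last10 = 1 and team = ARI. Every qualifying row for a given team gets the same value, and non-qualifying rows get 0. So for the example I listed, the result would look like this: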
team W L GF GA date home_ind last10 GF_H_10
67 ARI 1 0 3 2 2016-11-01 1 1 11
99 ARI 1 0 2 2 2016-11-03 1 1 11
129 ARI 1 0 4 3 2016-10-15 1 1 11
171 ARI 1 0 5 4 2016-10-27 0 1 0
241 ARI 0 10 1 5 2016-11-04 0 0 0
316 ARI 0 10 3 5 2016-10-25 0 1 0
331 ARI 0 10 2 3 2016-10-21 0 1 0
334 ARI 0 10 2 3 2016-10-29 1 1 11
335 ARI 0 10 2 5 2016-10-20 0 1 0
340 ARI 0 10 4 7 2016-10-18 0 1 0
341 ARI 0 10 2 3 2016-10-23 0 1 0
Answer 0 (score: 0)
How about this:
First build a boolean mask called criteria, then use it in an assignment:
criteria = (df['home_ind'] == 1) & (df['last10'] == 1) & (df['team'] == 'ARI')
df.loc[criteria, 'GF_H_10'] = df.loc[criteria, 'GF'].sum()
This gives:
GA GF L W date home_ind last10 team GF_H_10
67 2 3 0 1 2016-11-01 1 1 ARI 11.0000
99 2 2 0 1 2016-11-03 1 1 ARI 11.0000
129 3 4 0 1 2016-10-15 1 1 ARI 11.0000
171 4 5 0 1 2016-10-27 0 1 ARI nan
241 5 1 10 0 2016-11-04 0 0 ARI nan
316 5 3 10 0 2016-10-25 0 1 ARI nan
331 3 2 10 0 2016-10-21 0 1 ARI nan
334 3 2 10 0 2016-10-29 1 1 ARI 11.0000
335 5 2 10 0 2016-10-20 0 1 ARI nan
340 7 4 10 0 2016-10-18 0 1 ARI nan
341 3 2 10 0 2016-10-23 0 1 ARI nan
Then replace the NaN values with 0.0:
df['GF_H_10'] = df['GF_H_10'].fillna(0.0)
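As a self-contained check of the steps above, here is a minimal sketch that uses only the ARI rows from the question. Trimming the frame to the columns the condition needs is my own simplification, not part of the original answer. It initializes the new column to 0 first, so the intermediate NaN values never appear and the column stays integer-typed.

```python
import pandas as pd

# ARI rows from the question, trimmed to the columns the condition uses
# (assumption: the omitted columns do not affect the result)
df = pd.DataFrame(
    {
        'team': ['ARI'] * 11,
        'GF': [3, 2, 4, 5, 1, 3, 2, 2, 2, 4, 2],
        'home_ind': [1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0],
        'last10': [1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1],
    },
    index=[67, 99, 129, 171, 241, 316, 331, 334, 335, 340, 341],
)

# Boolean mask: home game, within the last 10, and team ARI
criteria = (df['home_ind'] == 1) & (df['last10'] == 1) & (df['team'] == 'ARI')

# Start from 0 everywhere, then write the sum into the qualifying rows only
df['GF_H_10'] = 0
df.loc[criteria, 'GF_H_10'] = df.loc[criteria, 'GF'].sum()

print(df)
```

Like the answer it follows, this handles a single team; the mask would have to be rebuilt for each of the other teams.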
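Answer 1 (score: 0)
The other solution here is specific to team ARI. This one does a groupby on team, so the operation is carried out for all 30 teams at once. I'm not sure which one you're after.
The main idea behind this solution is to do a groupby on team and then join the result back onto the original DataFrame. Afterwards, the result column is cleaned up according to the eligibility criteria you defined.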
import pandas as pd
# sample data
df = pd.DataFrame({'team':['ARI']*11+['BWI']*4,
'W':[1]*4+[0]*7+[1,1,0,0],
'GF':[3,2,4,5,1,3,2,2,2,4,2,2,2,2,2],
'GA':[2,2,3,4,5,5,3,3,5,7,3,1,1,1,1],
'home_ind':[1,1,1,0,0,0,0,1,0,0,0,1,1,0,0],
'last10':[1]*4+[0]+[1]*6+[1,0,1,1]})
# define a mask
df2 = df.assign(elig=(df['home_ind'] == 1) & (df['last10'] == 1))
# group on team and join the results to the original dataframe
df2 = df2.join(df2[df2['elig']].groupby('team')['GF'].sum(), on='team', rsuffix='_H_10')
# clean up the result column
df2.loc[~df2['elig'], 'GF_H_10'] = 0
Given the DataFrame:
GA GF W home_ind last10 team
0 2 3 1 1 1 ARI
1 2 2 1 1 1 ARI
2 3 4 1 1 1 ARI
3 4 5 1 0 1 ARI
4 5 1 0 0 0 ARI
5 5 3 0 0 1 ARI
6 3 2 0 0 1 ARI
7 3 2 0 1 1 ARI
8 5 2 0 0 1 ARI
9 7 4 0 0 1 ARI
10 3 2 0 0 1 ARI
11 1 2 1 1 1 BWI
12 1 2 1 1 0 BWI
13 1 2 0 0 1 BWI
14 1 2 0 0 1 BWI
Output:
GA GF W home_ind last10 team elig GF_H_10
0 2 3 1 1 1 ARI True 11
1 2 2 1 1 1 ARI True 11
2 3 4 1 1 1 ARI True 11
3 4 5 1 0 1 ARI False 0
4 5 1 0 0 0 ARI False 0
5 5 3 0 0 1 ARI False 0
6 3 2 0 0 1 ARI False 0
7 3 2 0 1 1 ARI True 11
8 5 2 0 0 1 ARI False 0
9 7 4 0 0 1 ARI False 0
10 3 2 0 0 1 ARI False 0
11 1 2 1 1 1 BWI True 2
12 1 2 1 1 0 BWI False 0
13 1 2 0 0 1 BWI False 0
14 1 2 0 0 1 BWI False 0
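A possible variant of the same per-team idea, offered as a sketch rather than as the answer's own method: `groupby(...).transform('sum')` returns a result aligned with the original rows, so no join is needed. The sample data is the same as above; the names `elig` and `team_sum` are only illustrative.

```python
import pandas as pd

# Same sample data as in the answer above
df = pd.DataFrame({'team': ['ARI'] * 11 + ['BWI'] * 4,
                   'W': [1] * 4 + [0] * 7 + [1, 1, 0, 0],
                   'GF': [3, 2, 4, 5, 1, 3, 2, 2, 2, 4, 2, 2, 2, 2, 2],
                   'GA': [2, 2, 3, 4, 5, 5, 3, 3, 5, 7, 3, 1, 1, 1, 1],
                   'home_ind': [1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0],
                   'last10': [1] * 4 + [0] + [1] * 6 + [1, 0, 1, 1]})

# Eligibility mask: home game and within the last 10
elig = (df['home_ind'] == 1) & (df['last10'] == 1)

# Zero out GF on non-eligible rows, then broadcast each team's sum to all of its rows
team_sum = df['GF'].where(elig, 0).groupby(df['team']).transform('sum')

# Keep the team sum on eligible rows only; everything else gets 0
df['GF_H_10'] = team_sum.where(elig, 0)

print(df)
```

Because non-eligible rows contribute 0 to the sum, a team with no eligible rows simply ends up with 0 throughout, and the column keeps an integer dtype.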