Here is a sample from a large DataFrame.
import numpy as np
import pandas as pd
ss = {'EventCode': pd.Series(['Goal Away', 'Goal Away', 'Goal Home', 'Goal Away','Goal Home', 'Goal Home', 'Cancel Goal Home', 'Goal Home','Goal Home', 'Goal Away', 'Goal Away', 'Goal Home','Goal Away', 'Goal Home', 'Goal Away', 'Goal Home']),
'Team1_Goal': pd.Series([2,2,2,2,2,0,0,5,5,5,5,5,5,5,5,5]),
'Team2_Goal': pd.Series([3,3,3,3,3,3,0,0,4,4,4,4,4]),
'xG_Team1': pd.Series([1.44344827512893,1.44344827512893,1.44344827512893,1.44344827512893,1.44344827512893,2.665637391386118,2.665637391386118,1.1554900289157282,1.1554900289157282,1.1554900289157282,1.1554900289157282,1.1554900289157282,1.1554900289157282,1.1554900289157282,1.1554900289157282,1.1554900289157282]),
'xG_Team2': pd.Series([1.5713173919057721,1.5713173919057721,1.5713173919057721,1.5713173919057721,1.5713173919057721,0.5207680077479664,0.5207680077479664,1.7456786951765073,1.7456786951765073,1.7456786951765073,1.7456786951765073,1.7456786951765073,1.7456786951765073,1.7456786951765073,1.7456786951765073,1.7456786951765073]),
'new_col1': pd.Series([0,0,179,0,190,123,0,29,75,0,0,118,0,143,0,190]),
'new_col2':pd.Series([100,163,0,181,0,0,0,0,0,97,112,0,140,0,186,0])}
df = pd.DataFrame(ss)
I have a function that computes a result from each (xG_Team1, xG_Team2) pair. This works well.
x1 = [1,0,0]
x2 = [0,1,0]
x3 = [0,0,1]
# Constants
total_timeslot = 180
m = 1
k = 180
Home_Goal = [] # No Goal
Away_Goal = [] # No Goal
def sum_squared_diff(x1, x2, x3, y):
    ssd = []
    for k in range(total_timeslot): # k will take multiple values
        if k in Home_Goal:
            ssd.append(sum((x2 - y) ** 2))
        elif k in Away_Goal:
            ssd.append(sum((x3 - y) ** 2))
        else:
            ssd.append(sum((x1 - y) ** 2))
    return ssd
def my_function(row):
    xG_Team1 = row.xG_Team1
    xG_Team2 = row.xG_Team2
    return np.array([1-(xG_Team1*m + xG_Team2*m)/k, xG_Team1*m/k, xG_Team2*m/k])
results = df.apply(lambda row: sum_squared_diff(x1, x2, x3, my_function(row)), axis=1)
results
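The problem is that the function above only works when Home_Goal and Away_Goal are zero or empty lists.
I would like to fill Home_Goal and Away_Goal from new_col1 and new_col2 respectively, for each (xG_Team1, xG_Team2) pair. For example, for the pair xG_Team1 = 1.44344827512893 and xG_Team2 = 1.5713173919057721, the function above should use Home_Goal = [179, 190] and Away_Goal = [100, 163, 181].
Any help is greatly appreciated.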
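Answer 0 (score: 1)
You can do it like this:
df['new_col'] = df['new_col1'] + df['new_col2']
result = df.groupby(['xG_Team1','xG_Team2','EventCode'])['new_col'].apply(list).reset_index()
The result is a new DataFrame with a new_col column that holds the list of values for each Goal Away and Goal Home event within each (xG_Team1, xG_Team2) pair.
Output: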
xG_Team
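To connect the grouped lists back to the function in the question, here is a minimal sketch, not the original poster's code. It assumes that sum_squared_diff is changed to take the home and away lists as arguments (my assumption, instead of the global Home_Goal and Away_Goal), and that a 0 in new_col1 or new_col2 means "no event" and can be dropped. The names probabilities and results, and the upper-case constants, are introduced here for illustration only. It groups on the two xG columns directly rather than on EventCode, and uses only the first seven rows of the sample data.

```python
import numpy as np
import pandas as pd

TOTAL_TIMESLOT = 180
M = 1
K = 180
X1 = np.array([1, 0, 0])  # target when no goal is scored in the slot
X2 = np.array([0, 1, 0])  # target for a home goal
X3 = np.array([0, 0, 1])  # target for an away goal

def probabilities(xg1, xg2):
    # Same formula as my_function in the question, taking the two values directly.
    return np.array([1 - (xg1 * M + xg2 * M) / K, xg1 * M / K, xg2 * M / K])

def sum_squared_diff(y, home_goal, away_goal):
    # Assumption: the goal lists are passed in instead of being read from globals.
    home, away = set(home_goal), set(away_goal)
    ssd = []
    for slot in range(TOTAL_TIMESLOT):
        if slot in home:
            target = X2
        elif slot in away:
            target = X3
        else:
            target = X1
        ssd.append(float(np.sum((target - y) ** 2)))
    return ssd

# First seven rows of the sample data from the question (two xG pairs).
df = pd.DataFrame({
    'xG_Team1': [1.44344827512893] * 5 + [2.665637391386118] * 2,
    'xG_Team2': [1.5713173919057721] * 5 + [0.5207680077479664] * 2,
    'new_col1': [0, 0, 179, 0, 190, 123, 0],
    'new_col2': [100, 163, 0, 181, 0, 0, 0],
})

results = {}
for (xg1, xg2), grp in df.groupby(['xG_Team1', 'xG_Team2'], sort=False):
    # Assumption: zeros are placeholders for "no event", so they are dropped.
    home_goal = [int(v) for v in grp['new_col1'] if v != 0]
    away_goal = [int(v) for v in grp['new_col2'] if v != 0]
    results[(xg1, xg2)] = sum_squared_diff(probabilities(xg1, xg2), home_goal, away_goal)
```

Here results maps each (xG_Team1, xG_Team2) pair to its list of 180 values. Note that the loop only visits slots 0 to 179, so times such as 181 or 190 in the sample never match any slot; whether total_timeslot should be larger is something only the data's owner can decide.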