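I currently have a DataFrame that I use to analyze sports data. One column, "Team", holds the team a player belongs to, and another column, "Game Info", holds information about the game. The Game Info column looks like this: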
SAC@HOU 12/09/2019 08:00PM ET
The "Team" column can contain either "SAC" or "HOU". I am trying to create a new column that contains the opponent. So far I have tried:
df.insert(7, "Opp", '', True)
df["Opp"][df['Game Info'].str[:3].str.contains(df['Team'])] = df['Game Info'].str[4:7]
df["Opp"][df['Opp'].empty] = df['Team']
This gives me the following error:
'Series' objects are mutable, thus they cannot be hashed
I have also tried
df['Opp'] = np.where(df['Team'].str != df['Game Info'].str[:3]), df['Game Info'].str[:3], df['Game Info'].str[4:7])
and
df['Opp'] = df['Game Info'].str[:3] if df['Team'].str != df['Game Info'].str[:3] else df['Game Info'].str[4:7]
but both give me the following error:
The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()
How can I compare these substrings correctly?
Answer 0 (score: 1)
Use:
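import numpy as np
import pandas as pd
df = pd.DataFrame({'Team': ['SAC', 'HOU'], 'Game Info': ['SAC@HOU 12/09/2019 08:00PM ET', 'SAC@HOU 12/09/2019 08:00PM ET']})
# If Team matches the first code, the opponent is the second code; otherwise it is the first.
df['Opp'] = np.where(df['Team'] == df['Game Info'].str[:3], df['Game Info'].str[4:7], df['Game Info'].str[:3])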
df
  Team                      Game Info  Opp
0  SAC  SAC@HOU 12/09/2019 08:00PM ET  HOU
1  HOU  SAC@HOU 12/09/2019 08:00PM ET  SAC
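The fixed slices [:3] and [4:7] only work when the string is exactly "SAC@HOU ..." with three-letter codes and no spaces around the @. If the format can vary, one possible variation is to pull both codes out with a regular expression and keep the same element-wise np.where comparison. This is only a sketch: the group names first and second and the sample rows are made up for illustration, and it assumes team codes consist of word characters.

```python
import numpy as np
import pandas as pd

# Sample rows (illustrative): one with spaces around "@", one without.
df = pd.DataFrame({
    "Team": ["SAC", "HOU"],
    "Game Info": [
        "SAC @ HOU 12/09/2019 08:00 PM ET",
        "SAC@HOU 12/09/2019 08:00PM ET",
    ],
})

# Capture the code on each side of "@", tolerating optional whitespace.
# "first" and "second" are hypothetical group names chosen for this sketch.
codes = df["Game Info"].str.extract(r"^\s*(?P<first>\w+)\s*@\s*(?P<second>\w+)")

# Compare Series to Series directly (no .str on the left-hand side);
# np.where picks a value row by row instead of asking for one overall truth value.
df["Opp"] = np.where(df["Team"] == codes["first"], codes["second"], codes["first"])
```

The Python if/else expression in the question fails because it needs a single True or False, while the comparison produces one boolean per row; np.where consumes that per-row result directly.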