How to convert raw data into something easier to compute with
info.teams
['Australia', 'Sri Lanka']
['Australia', 'Sri Lanka']
['Australia', 'Sri Lanka']
['India', 'West Indies']
['India', 'West Indies']
['Bangladesh', 'West Indies']
['Australia', 'Sri Lanka']
['Bangladesh', 'India']
['Australia', 'Sri Lanka']
['India', 'West Indies']
['India', 'South Africa']
['Afghanistan', 'India']
['India', 'South Africa']
['Australia', 'Sri Lanka']
['India', 'Sri Lanka']
['India', 'Sri Lanka']
['India', 'Sri Lanka']
['Australia', 'Sri Lanka']
['Bangladesh', 'West Indies']
['Afghanistan', 'India']
['India', 'South Africa']
['Australia', 'Sri Lanka']
['Australia', 'Sri Lanka']
['Bangladesh', 'West Indies']
['India', 'West Indies']
['Bangladesh', 'West Indies']
['Bangladesh', 'India']
['India', 'South Africa']
This is the data type of the column:
info.teams 1547 non-null object
Suppose I want to find the matches played between a particular pair of teams, e.g. ['India', 'Australia'].
I have to write code like the following:
#choosing particular teams
team_1='India'
team_2='Australia'
team_12='['+"'"+team_1+"'"+', '+"'"+team_2+"'"+']'
team_21='['+"'"+team_2+"'"+', '+"'"+team_1+"'"+']'
df=df[(df['info.teams']==team_12) | (df['info.teams']==team_21)]
Answer 0 (score: 2)
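If the data is stored as strings, use ast.literal_eval
to convert them to lists, apply pd.Series to expand each list into columns,
and then use isin to select the matching rows, i.e.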
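import ast
import pandas as pd
# parse the string cells into real Python lists
df['teams'] = df['teams'].str.strip().apply(ast.literal_eval)
# expand each list into one column per team
ndf = df['teams'].apply(pd.Series)
# keep rows where both teams belong to the wanted pair
ndf[ndf.isin(['India','Sri Lanka']).all(axis=1)]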
0 1
14 India Sri Lanka
15 India Sri Lanka
16 India Sri Lanka
If you want to select the data from the main dataframe, use the index from ndf, i.e.
idx = ndf[ndf.isin(['India','Sri Lanka']).all(axis=1)].index
df.loc[idx]
teams
14 [India, Sri Lanka]
15 [India, Sri Lanka]
16 [India, Sri Lanka]
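For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. The sample rows and the names `wanted`, `mask` and `result` are made up for illustration, and the `isinstance` check is an added assumption for the case where the column may already hold real lists. It also relies on every row holding exactly two distinct teams, as in the question's data.

```python
import ast

import pandas as pd

# Hypothetical sample data: each cell is a *string* that looks like a list.
df = pd.DataFrame(
    {
        "info.teams": [
            "['Australia', 'Sri Lanka']",
            "['India', 'West Indies']",
            "['India', 'Sri Lanka']",
            "['Afghanistan', 'India']",
            "['India', 'Sri Lanka']",
        ]
    }
)

# Parse only if the cells are strings; real lists are left untouched.
if isinstance(df["info.teams"].iloc[0], str):
    df["info.teams"] = df["info.teams"].str.strip().apply(ast.literal_eval)

# One column per team, aligned on the original index.
ndf = pd.DataFrame(df["info.teams"].tolist(), index=df.index)

# Keep rows where both teams are in the wanted pair, in either order.
wanted = ["India", "Sri Lanka"]
mask = ndf.isin(wanted).all(axis=1)
result = df.loc[mask]
print(result)
```

Because `isin` ignores position, the same mask matches a row whether it is stored as India/Sri Lanka or Sri Lanka/India, so there is no need to build both string variants by hand.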
Answer 1 (score: -1)
Not sure what you are looking for. But if you want to split the teams into two separate columns, you could do something like this:
info[['team1','team2']]=pd.DataFrame(info.teams.values.tolist(), index=info.index)
Output:
teams team1 team2
0 [aus,ind] aus ind
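A runnable sketch of the column split, assuming the cells already hold real lists of exactly two teams. The frame contents and the name `split` are hypothetical. The non-default index is used on purpose: it shows why the new frame should be built on the same index as `info`, since the assignment aligns rows by index label.

```python
import pandas as pd

# Hypothetical frame with a non-default index, to show why index= matters.
info = pd.DataFrame(
    {"teams": [["aus", "ind"], ["ind", "sl"]]},
    index=[10, 20],
)

# Build the two new columns on the same index as `info` so rows line up.
split = pd.DataFrame(
    info["teams"].tolist(),
    index=info.index,
    columns=["team1", "team2"],
)
info[["team1", "team2"]] = split
print(info)
```

Once the teams live in their own columns, a pair can be selected with ordinary comparisons on `team1` and `team2`.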