The dataframe looks like this (call it df1):
teamname player.1 player.2 player.3
xyz abc nan def
gh1 nan hgf jnr
oed jeo nan nan
The output should look like this (call it df2):
teamname  player
xyz       abc
          def
gh1       hgf
          jnr
oed       jeo
Answer 0 (score: 0)
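import pandas as pd

# Collect the player column names (player.1, player.2, ...)
player_cols = [col for col in df1.columns if 'player' in col.lower()]
df_parts = []  # List to store the mini-dataframes
for col in player_cols:
    # Keep the team name plus one player column at a time
    df_auxiliary = df1[['teamname', col]]
    # Give every part the same column name so the parts stack cleanly
    df_auxiliary = df_auxiliary.rename(columns={col: 'player'})
    # Drop the rows where this player slot is empty
    df_auxiliary = df_auxiliary.dropna()
    df_parts.append(df_auxiliary)
df2 = pd.concat(df_parts)  # Create the final dataframe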
Or in "one line":
# sep='.' is needed because the columns are named player.1, player.2, ...
df2 = pd.wide_to_long(df1, stubnames='player', i=['teamname'], j='player_num', sep='.')
df2 = df2.dropna()
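For anyone who wants to try the wide_to_long route end to end, here is a minimal sketch. It rebuilds df1 from the question by hand (the construction is my assumption about how the data is stored, with missing players as NaN) and passes sep='.' so that the player.1, player.2, ... column names are recognised. The name long_df is only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question (assumed layout, NaN for missing players)
df1 = pd.DataFrame({
    "teamname": ["xyz", "gh1", "oed"],
    "player.1": ["abc", np.nan, "jeo"],
    "player.2": [np.nan, "hgf", np.nan],
    "player.3": ["def", "jnr", np.nan],
})

# Reshape: the numeric suffix after the '.' separator becomes the player_num index level
long_df = pd.wide_to_long(
    df1, stubnames="player", i="teamname", j="player_num", sep="."
)

# Drop empty player slots and turn the (teamname, player_num) index back into columns
long_df = long_df.dropna().reset_index()
print(long_df)
```

The result keeps a player_num column recording which slot each player came from; drop it if only teamname and player are needed.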
Answer 1 (score: 0)
I would go with melt(), which is quite versatile. Starting from:
teamname player.1 player.2 player.3
0 xyz abc NaN def
1 gh1 NaN hgf jnr
2 oed jeo NaN NaN
the following call produces the output shown after it:
df.melt(id_vars=['teamname'], value_name='player').dropna().drop('variable', axis=1).sort_values(['teamname'], ascending=False).set_index('teamname')
player
teamname
xyz abc
xyz def
oed jeo
gh1 hgf
gh1 jnr
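The calls chained after melt() drop the NaN rows, remove the variable column we do not need, and sort the dataframe. Finally, we set teamname as the index.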
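The same chain can be written out as a self-contained sketch. The df1 construction below is my assumption about the input, and kind="mergesort" is an addition that is not in the answer above: it requests a stable sort, with the intent of keeping each team's players in their original column order.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question (assumed layout, NaN for missing players)
df1 = pd.DataFrame({
    "teamname": ["xyz", "gh1", "oed"],
    "player.1": ["abc", np.nan, "jeo"],
    "player.2": [np.nan, "hgf", np.nan],
    "player.3": ["def", "jnr", np.nan],
})

df2 = (
    df1.melt(id_vars=["teamname"], value_name="player")  # wide -> long
       .dropna(subset=["player"])                        # drop empty player slots
       .drop(columns="variable")                         # original column name is not needed
       .sort_values("teamname", ascending=False, kind="mergesort")  # stable sort (assumption)
       .set_index("teamname")
)
print(df2)
```

If the teamname index is not wanted, leave out the final set_index call and use reset_index(drop=True) instead to get a clean 0..n-1 index.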