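I currently have a table that looks like this:
but I would like it to look like this:
I can get it into that shape when there is only one league, but I don't know how to do it when there are several. If it can't be done I will format it by hand, but I would prefer to have it formatted automatically before saving it to a file.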
df.drop(df.columns[2], axis=1, inplace=True) # get rid of All column
df.drop(df.columns[3], axis=1, inplace=True) # get rid of KO column
df.drop(df.columns[4], axis=1, inplace=True) # get rid of All.1 column
df.columns = ['League', 'Home', 'Home Team', 'Away Team', 'Away'] # rename columns
df = df.replace(to_replace=np.nan, value='0%') # replace NaN values with 0%
for x in range(len(df.index)):
    df.loc[x,'League'] = df.loc[0,'League'] # get league name and copy it to every row in column 0
df = df.drop(df.index[0]) # get rid of top row
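This is my current code. It drops the columns I don't want and copies the league name into every row, but it only works when there is a single league, not several.
Any solution would be much appreciated.
Answer 0 (score: 1)
You can try: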
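df1 = df.ffill() # forward-fill so each match row picks up the league name above it
df1 = df1[~df1.eq(df1['Unnamed: 0'], axis='index').all(axis=1)] # drop rows where every cell equals the first column (the league header rows)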
df1:
   Unnamed: 0  Home  All  Home Team     KO  Away Team  All.1  Away
1    League 1  100%  50%     Team 1  23:00     Team 2     0%    0%
2    League 1  100%  50%     Team 3  23:00     Team 4    53%   53%
4    League 2   75%  75%     Team 5  20:00     Team 6    29%   29%
6    League 3   50%  75%     Team 7  14:00     Team 8    50%   67%
7    League 3    0%  17%     Team 9  14:00    Team 10    50%   50%
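If it helps, here is a rough end-to-end sketch that combines the same idea with the column cleanup from the question. The sample frame is made up, and it assumes that each league header row repeats the league name in every column while the match rows leave the first column empty; if your data is laid out differently, the header test will need adjusting. `tidy_leagues` is just a hypothetical helper name.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data. Assumption: a league header row repeats the
# league name in every column, and match rows have NaN in the first column.
cols = ['Unnamed: 0', 'Home', 'All', 'Home Team', 'KO', 'Away Team', 'All.1', 'Away']
rows = [
    ['League 1'] * 8,
    [np.nan, '100%', '50%', 'Team 1', '23:00', 'Team 2', np.nan, np.nan],
    [np.nan, '100%', '50%', 'Team 3', '23:00', 'Team 4', '53%', '53%'],
    ['League 2'] * 8,
    [np.nan, '75%', '75%', 'Team 5', '20:00', 'Team 6', '29%', '29%'],
]
df = pd.DataFrame(rows, columns=cols)


def tidy_leagues(raw):
    """Hypothetical helper: attach the league name to each match row."""
    out = raw.copy()
    first = out.columns[0]
    # Header rows are the ones where every cell equals the first column.
    is_header = out.eq(out[first], axis='index').all(axis=1)
    # Forward-fill only the league column, not the percentage columns.
    out[first] = out[first].ffill()
    out = out[~is_header]
    # Drop by name so the positions do not shift between calls.
    out = out.drop(columns=['All', 'KO', 'All.1'])
    out.columns = ['League', 'Home', 'Home Team', 'Away Team', 'Away']
    # Remaining NaN values are treated as 0%, as in the question.
    out = out.fillna('0%')
    return out.reset_index(drop=True)


result = tidy_leagues(df)
print(result)
```

Forward-filling only the first column is a deliberate difference from `df.ffill()`: a whole-frame fill would copy the value from the row above into any percentage cell that is genuinely missing, whereas here those cells stay empty until they are replaced with '0%'.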