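I asked this question yesterday, but a few things were unclear, so I am reposting it here. Basically, I have a DataFrame with 13 columns and more than 500 rows, and I am trying to add a header row every x rows.
I'm a beginner, so I tried .concat and .append, but I'm not sure I was using them correctly.
My variable headers = ['Rk','Player','Age',...]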
In: print(final.head())
out:
   index            Player  Age   Tm  Pos  GP   G   A    P  +/-  PPP    TOI
0      0   Nikita Kucherov   25  TBL   RW  82  41  87  128   24   41  19:58
1      4     Brad Marchand   30  BOS   LW  79  36  64  100   15   33  19:37
2      5     Sidney Crosby   31  PIT    C  79  35  65  100   18   20  21:00
3      6  Nathan MacKinnon   23  COL    C  82  41  58   99   20   31  22:05
4      7   Johnny Gaudreau   25  CGY   LW  82  36  63   99   18   29  20:04
I want to print the header every 48 rows. If I printed it every 2 rows instead, it would look like this:
In: print(final.head())
out:
   index            Player  Age   Tm  Pos  GP   G   A    P  +/-  PPP    TOI
0      0   Nikita Kucherov   25  TBL   RW  82  41  87  128   24   41  19:58
1      4     Brad Marchand   30  BOS   LW  79  36  64  100   15   33  19:37
                    Player  Age   Tm  Pos  GP   G   A    P  +/-  PPP    TOI
2      5     Sidney Crosby   31  PIT    C  79  35  65  100   18   20  21:00
3      6  Nathan MacKinnon   23  COL    C  82  41  58   99   20   31  22:05
                    Player  Age   Tm  Pos  GP   G   A    P  +/-  PPP    TOI
4      7   Johnny Gaudreau   25  CGY   LW  82  36  63   99   18   29  20:04
Note that I don't much care what the index column contains in the inserted header rows; I'm flexible on that part.
Answer 0 (score: 1)
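It is possible, but not recommended if you need to process the data later, because mixing numeric values with strings in the same columns makes some functions fail: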
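import numpy as np
import pandas as pd

N = 2
# N = 48 with real data
# index labels where a header row is inserted; skip the first, since the column names already head the table
idx = df.index[::N][1:]
# repeat the column names once per inserted header row
arr = np.broadcast_to(df.columns, (len(idx), len(df.columns)))
df1 = pd.DataFrame(arr, index=idx, columns=df.columns)
# DataFrame.append was removed in pandas 2.0, so combine with pd.concat;
# a stable sort keeps each header row ahead of the data row that shares its label
df = pd.concat([df1, df]).sort_index(kind='mergesort').reset_index(drop=True)
print(df)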
   index            Player  Age   Tm  Pos  GP   G   A    P  +/-  PPP    TOI
0      0   Nikita Kucherov   25  TBL   RW  82  41  87  128   24   41  19:58
1      4     Brad Marchand   30  BOS   LW  79  36  64  100   15   33  19:37
2  index            Player  Age   Tm  Pos  GP   G   A    P  +/-  PPP    TOI
3      5     Sidney Crosby   31  PIT    C  79  35  65  100   18   20  21:00
4      6  Nathan MacKinnon   23  COL    C  82  41  58   99   20   31  22:05
5  index            Player  Age   Tm  Pos  GP   G   A    P  +/-  PPP    TOI
6      7   Johnny Gaudreau   25  CGY   LW  82  36  63   99   18   29  20:04
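If the repeated header is only needed for display, one option is to leave the DataFrame untouched and insert the header at print time, so the numeric columns stay numeric. A minimal sketch, assuming the table has already been rendered to text lines with the header first (for example with `final.to_string().splitlines()`); `repeat_header` is a hypothetical helper name, not a pandas function:

```python
def repeat_header(lines, n):
    """Return the lines with the header (lines[0]) repeated before every n data rows."""
    header, rows = lines[0], lines[1:]
    out = []
    for start in range(0, len(rows), n):
        out.append(header)                 # header opens each group
        out.extend(rows[start:start + n])  # up to n data rows follow it
    return out


# Sample rendered lines, a stand-in for final.to_string().splitlines()
lines = [
    "Player            Age",
    "Nikita Kucherov    25",
    "Brad Marchand      30",
    "Sidney Crosby      31",
    "Nathan MacKinnon   23",
    "Johnny Gaudreau    25",
]
print("\n".join(repeat_header(lines, 2)))
```

With the real data, `n` would be 48. Because only the rendered text changes, later calculations on the DataFrame are unaffected.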
EDIT: To write each split DataFrame to a separate sheet in one Excel file, use:
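N = 2
# N = 48 with real data
# consecutive chunks of N rows, the same pieces as np.split(df, range(N, len(df), N));
# plain iloc slicing avoids relying on np.split accepting a DataFrame
chunks = [df.iloc[i:i + N] for i in range(0, len(df), N)]
with pd.ExcelWriter('file.xlsx') as writer:
    for i, df1 in enumerate(chunks):
        df1.to_excel(writer, sheet_name=f'Sheet{i}', index=False)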
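The chunking step splits the rows into consecutive groups of N, with a shorter final group when the row count is not a multiple of N. A small sketch of the row positions involved, assuming plain positional slicing; `chunk_bounds` is a hypothetical helper used only for illustration:

```python
def chunk_bounds(total, n):
    """Return (start, stop) row positions for consecutive chunks of n rows."""
    return [(start, min(start + n, total)) for start in range(0, total, n)]


print(chunk_bounds(5, 2))  # [(0, 2), (2, 4), (4, 5)]
```

Each pair can be used as `df.iloc[start:stop]` to obtain the corresponding chunk.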
EDIT1: To write all the DataFrames to the same sheet:
# https://stackoverflow.com/a/33004253 + added index=False to df.to_excel
def multiple_dfs(df_list, sheets, file_name, spaces):
    # the context manager saves and closes the file; ExcelWriter.save() was removed in pandas 2.0
    with pd.ExcelWriter(file_name, engine='xlsxwriter') as writer:
        row = 0
        for dataframe in df_list:
            dataframe.to_excel(writer, sheet_name=sheets, startrow=row, startcol=0, index=False)
            # advance past the data rows, the header row, and the blank spacer rows
            row = row + len(dataframe.index) + spaces + 1

N = 2
# N = 48 with real data
dfs = [df.iloc[i:i + N] for i in range(0, len(df), N)]
multiple_dfs(dfs, 'Sheetname1', 'file.xlsx', 1)
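The `startrow` bookkeeping in `multiple_dfs` can be checked without writing a file. A small sketch, assuming each frame occupies one header row plus its data rows and is followed by `spaces` blank rows; `start_rows` is a hypothetical helper, not part of the answer's code:

```python
def start_rows(lengths, spaces):
    """Return the 0-based sheet row where each frame's header is written."""
    rows, row = [], 0
    for n in lengths:
        rows.append(row)
        # n data rows + 1 header row + `spaces` blank rows before the next frame
        row += n + spaces + 1
    return rows


print(start_rows([2, 2, 1], spaces=1))  # [0, 4, 8]
```

With five rows, N = 2 and one spacer row, the three blocks would start at sheet rows 0, 4 and 8.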