Hi, I have several xlsx files:
sales-feb-2014.xlsx
sales-jan-2014.xlsx
sales-mar-2014.xlsx
I have merged all three sheets into a single dataset, using the file name as index level 0.
Script:
import pandas as pd
import numpy as np
import glob
import os
all_data = pd.DataFrame()
for f in glob.glob(r'H:\Learning\files\sales*.xlsx'):
    df = pd.read_excel(f)
    df['filename'] = os.path.basename(f)
    df = df.reset_index().set_index('filename')
    print(df)
The data now looks like this:
file name col1 col2 col3
sales-jan-2014.xlsx .... .... ...
sales-feb-2014.xlsx .... .... ...
sales-mar-2014.xlsx .... .... ...
Now I want to write this into a new xlsx file, with:
sales-jan-2014.xlsx into sheet1
sales-feb-2014.xlsx into sheet2
sales-mar-2014.xlsx into sheet3
I tried this script:
writer = pd.ExcelWriter('output.xlsx')
for filename in df.index.get_level_values(0).unique():
    temp_df = df.xs(filename, level=0)
    temp_df.to_excel(writer, filename)
writer.save()
After running this script I get the error:
loc, new_ax = labels.get_loc_level(key, level=level,
AttributeError: 'Index' object has no attribute 'get_loc_level'
Can you suggest what I am missing?
Answer 0: (score: 0)
Try the following code:
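import os
import pandas as pd
dirpath = "C:\\Users\\Path\\TO\\Your XLS folder\\data\\"
# Read only the Excel workbooks, and skip the output file on re-runs
fileNames = [f for f in os.listdir(dirpath) if f.endswith('.xlsx') and f != 'combined.xlsx']
# The context manager saves the workbook on exit (writer.save() was removed in pandas 2.0)
with pd.ExcelWriter(dirpath + 'combined.xlsx', engine='xlsxwriter') as writer:
    for fname in fileNames:
        df = pd.read_excel(dirpath + fname)
        print(df)
        # Excel limits sheet names to 31 characters
        df.to_excel(writer, sheet_name=fname[:31])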
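One thing to watch when the file name is used as the sheet name: Excel limits sheet names to 31 characters and does not allow the characters \ / ? * [ ] : in them, so a long or unusual file name can make the write fail. Below is a minimal sketch of deriving a safe sheet name from a file name. The helper name sheet_name_for is my own and not part of pandas.

```python
import os
import re

def sheet_name_for(path, max_len=31):
    """Derive an Excel-safe sheet name from a file path (hypothetical helper)."""
    # Drop the directory and the extension: 'sales-jan-2014.xlsx' -> 'sales-jan-2014'
    stem = os.path.splitext(os.path.basename(path))[0]
    # Replace the characters Excel does not allow in sheet names
    stem = re.sub(r'[\\/?*\[\]:]', '_', stem)
    # Excel caps sheet names at 31 characters
    return stem[:max_len]

names = [sheet_name_for(f) for f in
         ['sales-jan-2014.xlsx', 'sales-feb-2014.xlsx', 'sales-mar-2014.xlsx']]
print(names)
```

It could then be used as df.to_excel(writer, sheet_name=sheet_name_for(fname)). Note that truncating can make two long names collide, which this sketch does not handle.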
You can also keep your own code with the following change:
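frames = []
for f in glob.glob(r'H:\Learning\files\sales*.xlsx'):
    df = pd.read_excel(f)
    df['filename'] = os.path.basename(f)
    df = df.reset_index()
    print(df.columns)
    # A two-level index (filename, index) is what df.xs(filename, level=0) needs
    df.set_index(['filename', 'index'], inplace=True)
    frames.append(df)
# Combine the per-file frames so that df holds the rows of all three files
df = pd.concat(frames)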
and then save it the same way you were saving it before.
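The reason for the change: df.xs(filename, level=0) needs a MultiIndex, while set_index('filename') on its own produces a plain Index, which has no get_loc_level. Here is a minimal sketch that uses made-up in-memory data in place of the three workbooks (the column names col1 and col2 and all values are assumptions), so it needs only pandas:

```python
import pandas as pd

# Stand-ins for the three workbooks; the columns and values are made up
files = {
    'sales-jan-2014.xlsx': pd.DataFrame({'col1': [1, 2], 'col2': [10, 20]}),
    'sales-feb-2014.xlsx': pd.DataFrame({'col1': [3], 'col2': [30]}),
    'sales-mar-2014.xlsx': pd.DataFrame({'col1': [4, 5], 'col2': [40, 50]}),
}

frames = []
for name, part in files.items():
    part = part.reset_index()
    part['filename'] = name
    # Two index levels: the file name, then the original row number
    frames.append(part.set_index(['filename', 'index']))

df = pd.concat(frames)

# With a MultiIndex, xs() can select one file's rows by level
sheets = {}
for filename in df.index.get_level_values(0).unique():
    sheets[filename] = df.xs(filename, level=0)

# Each entry could then be passed to to_excel(writer, sheet_name=...)
for filename, temp_df in sheets.items():
    print(filename, temp_df.shape)
```

Collecting the per-file frames in a list and concatenating once is a design choice here; it keeps df holding all three files instead of only the last one read.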
I hope this helps.