Split a DataFrame by column value and export each part to a separate Excel worksheet

Time: 2018-03-21 14:16:45

Tags: python excel python-3.x pandas

Sources I used previously:

Pandas: Iterate through a list of DataFrames and export each to excel sheets

Splitting dataframe into multiple dataframes

I managed to get all of this working:

# sort the dataframe (df.sort was removed from pandas; sort_values replaces it)
df.sort_values(by=['name'], inplace=True)
# set the index to be this and don't drop
df.set_index(keys=['name'], drop=False,inplace=True)
# get a list of names
names=df['name'].unique().tolist()
# now we can perform a lookup on a 'view' of the dataframe
joe = df.loc[df.name=='joe']
# now you can query all 'joes'

This works for me: joe = df.loc[df.name=='joe'] returns exactly the result I was looking for.

To make this work for a large amount of data, I found this potential solution:

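writer = pandas.ExcelWriter("MyData.xlsx", engine='xlsxwriter')
# sheet_name must be a string, so pair each DataFrame with a name
frames = {'Data': Data, 'ByBrand': ByBrand}
for name, frame in frames.items():
    frame.to_excel(writer, sheet_name=name)
# close() writes the file to disk
writer.close()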

Currently I have:

teacher_names = ['Teacher A', 'Teacher B', 'Teacher C']

df =

              ID   Teacher_name      Student_name
Teacher_name                                                         
Teacher A     1.0  Teacher A         Student 1 
Teacher A     NaN  Teacher A         Student 2  
Teacher B     0.0  Teacher B         Student 3 
Teacher C     2.0  Teacher C         Student 4 

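Using test = df.loc[df.Teacher_name=='Teacher A'] gives exactly the expected result.

Question: how can I automate this so that the "test" result is saved to Excel separately for each teacher (.to_excel(writer, sheet_name=Teacher_name)), with each sheet named after the teacher, for every teacher present in the data?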

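2 answers:

Answer 0: (score: 0)

This should work for you. You were almost there; you just need to iterate over the names list and filter your DataFrame each time.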

names = df['name'].unique().tolist()

writer = pandas.ExcelWriter("MyData.xlsx", engine='xlsxwriter')

for myname in names:
    mydf = df.loc[df.name==myname]
    mydf.to_excel(writer, sheetname=myname)

# close() writes the file; ExcelWriter.save() was removed in pandas 2.0
writer.close()
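
As a rough, self-contained sketch of the loop above: the sample frame, the helper names rows_per_value and export_per_value, and the output file name are assumptions made for illustration, not part of the original answer. It uses the sheet_name keyword that current pandas accepts and a with block so the workbook is written when the block exits. Writing .xlsx files also assumes an Excel engine such as openpyxl or xlsxwriter is installed.

```python
import pandas as pd

# Illustrative data modeled on the frame shown in the question.
df = pd.DataFrame({
    "ID": [1.0, None, 0.0, 2.0],
    "Teacher_name": ["Teacher A", "Teacher A", "Teacher B", "Teacher C"],
    "Student_name": ["Student 1", "Student 2", "Student 3", "Student 4"],
})

def rows_per_value(frame, column):
    """Yield (value, matching rows) for each distinct value in `column`."""
    for value in frame[column].unique():
        yield value, frame.loc[frame[column] == value]

def export_per_value(frame, column, path):
    """Hypothetical helper: write one sheet per distinct value of `column`."""
    # The context manager saves and closes the workbook on exit.
    with pd.ExcelWriter(path) as writer:
        for value, rows in rows_per_value(frame, column):
            rows.to_excel(writer, sheet_name=str(value), index=False)

# export_per_value(df, "Teacher_name", "MyData.xlsx")
```

Filtering once per name rescans the whole frame each time, which is fine for a handful of teachers but can become slow when there are many distinct values.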

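Answer 1: (score: 0)

@jpp, replace the text "sheetname" with "sheet_name". Also, once the "names" variable has been converted to a list, I get the following error when running the for loop to create multiple worksheets based on the unique name values: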

InvalidWorksheetName: Invalid Excel character '[]:*?/\' in sheetname '['.
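
The message indicates that a sheet name containing one of the characters Excel forbids reached the writer. One way to guard against that, sketched here under assumptions, is to clean every value before using it as a sheet name. The helper safe_sheet_name and its fallback name "Sheet" are hypothetical; the character set comes from the error message, and the 31-character cap is Excel's limit on worksheet names.

```python
import re

# Characters Excel rejects in worksheet names, as listed in the error message.
_INVALID_CHARS = re.compile(r"[\[\]:*?/\\]")

def safe_sheet_name(value, replacement="_"):
    """Hypothetical helper: turn any value into a usable worksheet name."""
    name = _INVALID_CHARS.sub(replacement, str(value))
    # Excel limits worksheet names to 31 characters; fall back if empty.
    return name[:31] or "Sheet"
```

It could then be used as sheet_name=safe_sheet_name(myname). Truncation can make two long names collide, which this sketch does not handle.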

An alternative way (via a function) to create multiple worksheets (in a single Excel file) based on column values:

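# the function relies on a writer created beforehand, as in the answer above
writer = pandas.ExcelWriter("MyData.xlsx", engine='xlsxwriter')

def writesheet(g):
    # every row in the group shares one name; use it as the sheet name
    a = g['name'].tolist()[0]
    g.to_excel(writer, sheet_name=str(a), index=False)

df.groupby('name').apply(writesheet)
# close() writes the file to disk
writer.close()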
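
A variant of the same groupby idea that iterates over the groups directly, rather than calling apply for its side effects, might look like the sketch below. The sample frame and the helper names split_by_column and write_sheets are assumptions for illustration. The index is reset first because the question's frame keeps Teacher_name as both the index and a column, which pandas may reject as ambiguous when grouping. Recent pandas releases have also been changing whether apply passes the grouping column to the function, which a plain loop sidesteps.

```python
import pandas as pd

# Illustrative frame shaped like the question's: the grouping column
# is kept as a column and also used as the index.
df = pd.DataFrame({
    "ID": [1.0, None, 0.0, 2.0],
    "Teacher_name": ["Teacher A", "Teacher A", "Teacher B", "Teacher C"],
    "Student_name": ["Student 1", "Student 2", "Student 3", "Student 4"],
}).set_index("Teacher_name", drop=False)

def split_by_column(frame, column):
    """Return {value: sub-frame} for each distinct value in `column`."""
    # Drop the index so `column` refers only to the column, not an index level.
    flat = frame.reset_index(drop=True)
    return {key: group for key, group in flat.groupby(column)}

def write_sheets(frame, column, path):
    """Hypothetical helper: one worksheet per group, in a single workbook."""
    with pd.ExcelWriter(path) as writer:
        for key, group in split_by_column(frame, column).items():
            group.to_excel(writer, sheet_name=str(key), index=False)

parts = split_by_column(df, "Teacher_name")
# write_sheets(df, "Teacher_name", "MyData.xlsx")
```

Unlike filtering once per name, groupby partitions the frame in a single pass, which tends to matter more as the number of distinct values grows.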

Source: How to split a large excel file into multiple worksheets based on their given ip address using pandas python