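I have been asked to generate some Excel reports. I currently use pandas heavily for my data, so naturally I would like to use the pandas.ExcelWriter method to generate these reports. However, the fixed column widths are a problem.
The code I have so far is simple enough. Say I have a dataframe called 'df':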
writer = pd.ExcelWriter(excel_file_path)
df.to_excel(writer, sheet_name="Summary")
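I was looking over the pandas code and I don't see any option to set column widths. Is there a trick that makes the columns auto-adjust to the data? Or is there something I can do to the xlsx file after the fact to adjust the column widths?
(I am using the OpenPyXL library and generating .xlsx files, if that makes any difference.)
Thank you.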
Answer 0 (score: 25)
Inspired by user6178746's answer, I have the following:
# Given a dict of dataframes, for example:
# dfs = {'gadgets': df_gadgets, 'widgets': df_widgets}
writer = pd.ExcelWriter(filename, engine='xlsxwriter')
for sheetname, df in dfs.items():  # loop through `dict` of dataframes
    df.to_excel(writer, sheet_name=sheetname)  # send df to writer
    worksheet = writer.sheets[sheetname]  # pull worksheet object
    for idx, col in enumerate(df):  # loop through all columns
        series = df[col]
        max_len = max((
            series.astype(str).map(len).max(),  # len of largest item
            len(str(series.name))  # len of column name/header
            )) + 1  # adding a little extra space
        # note: to_excel() also writes the index by default, which shifts the
        # data columns one to the right; pass index=False to keep idx aligned
        worksheet.set_column(idx, idx, max_len)  # set column width
writer.close()  # writer.save() was removed in pandas 2.0; close() also saves
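The width calculation itself does not depend on pandas or XlsxWriter, so it can be sketched on plain Python data. The function name `column_widths` and the `first_col` parameter are hypothetical, introduced only for illustration. `first_col=1` models the case where `to_excel()` writes a single-level index into worksheet column 0.

```python
def column_widths(headers, rows, padding=1, first_col=0):
    """Map worksheet column index -> width in characters.

    `rows` must be a re-iterable sequence (e.g. a list of tuples).
    `first_col` is the worksheet column of the first data column; use 1
    when a single-level index is written in front of the data.
    """
    widths = {}
    for pos, header in enumerate(headers):
        longest = len(str(header))                      # start with the header length
        for row in rows:
            longest = max(longest, len(str(row[pos])))  # longest cell text so far
        widths[first_col + pos] = longest + padding
    return widths


widths = column_widths(["id", "name"], [(1, "alice"), (22, "bob")], first_col=1)
print(widths)  # {1: 3, 2: 6}
```

Each resulting pair could then be passed to `worksheet.set_column(col, col, width)` as in the answer above.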
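Answer 1 (score: 19)
There is probably no automatic way to do it right now, but since you are using openpyxl, the following line (adapted from another answer by user Bufke on how to do it manually) lets you specify a sane value (in character widths):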
writer.sheets['Summary'].column_dimensions['A'].width = 15
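Building on that line, here is a minimal sketch that computes a width for every column of an openpyxl worksheet instead of hard-coding one. The helper name `autosize_openpyxl` and the padding of 2 are assumptions, not part of openpyxl. It relies only on `Worksheet.iter_cols()`, `get_column_letter()` and `column_dimensions`.

```python
from openpyxl import Workbook
from openpyxl.utils import get_column_letter


def autosize_openpyxl(worksheet, padding=2):
    """Set each column's width to its longest cell text plus padding."""
    for column_cells in worksheet.iter_cols():
        longest = max(
            (len(str(cell.value)) for cell in column_cells if cell.value is not None),
            default=0,
        )
        letter = get_column_letter(column_cells[0].column)  # 1 -> 'A', 27 -> 'AA'
        worksheet.column_dimensions[letter].width = longest + padding


# Stand-alone demonstration on an in-memory workbook.
wb = Workbook()
ws = wb.active
ws.append(["name", "qty"])
ws.append(["a fairly long product name", 3])
autosize_openpyxl(ws)
```

With pandas and the openpyxl engine, the same helper could presumably be called as `autosize_openpyxl(writer.sheets['Summary'])` after `df.to_excel(...)` and before the writer is closed. Header cells are ordinary cells in the sheet, so they are taken into account as well.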
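Answer 2 (score: 17)
I'm posting this because I just ran into the same issue and found that the official documentation for Xlsxwriter and pandas still lists this functionality as unsupported. I hacked together a solution that solved the problem I was having. I basically just iterate through each column and use worksheet.set_column to set the column width == the max length of the contents of that column.
One important note, however. The approach is built around the column values; the code below also compares against the header length (the max() call), so change that line if you want the headers handled differently. Hope this helps someone :)
import pandas as pd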
import sqlalchemy as sa
import urllib.parse  # quote_plus lives in urllib.parse on Python 3
read_server = 'serverName'
read_database = 'databaseName'
read_params = urllib.parse.quote_plus("DRIVER={SQL Server};SERVER="+read_server+";DATABASE="+read_database+";TRUSTED_CONNECTION=Yes")
read_engine = sa.create_engine("mssql+pyodbc:///?odbc_connect=%s" % read_params)
#Output some SQL Server data into a dataframe
my_sql_query = """ SELECT * FROM dbo.my_table """
my_dataframe = pd.read_sql_query(my_sql_query,con=read_engine)
#Set destination directory to save excel.
xlsFilepath = r'H:\my_project' + "\\" + 'my_file_name.xlsx'
writer = pd.ExcelWriter(xlsFilepath, engine='xlsxwriter')
#Write excel to file using pandas to_excel
my_dataframe.to_excel(writer, startrow = 1, sheet_name='Sheet1', index=False)
#Indicate workbook and worksheet for formatting
workbook = writer.book
worksheet = writer.sheets['Sheet1']
#Iterate through each column and set the width == the max length in that column. A padding length of 2 is also added.
for i, col in enumerate(my_dataframe.columns):
    # find length of column i
    column_len = my_dataframe[col].astype(str).str.len().max()
    # Setting the length if the column header is larger
    # than the max column value length
    column_len = max(column_len, len(col)) + 2
    # set the column length
    worksheet.set_column(i, i, column_len)
writer.close()  # writer.save() was removed in pandas 2.0; close() also saves
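One caveat with any `len()`-based estimate: it counts characters, and East Asian wide characters usually take roughly two character cells on screen, so columns containing such text may come out too narrow. A possible refinement, shown as a sketch only (the name `display_width` is hypothetical and the factor of 2 is an approximation), uses the standard library's `unicodedata.east_asian_width`:

```python
import unicodedata


def display_width(text):
    """Approximate width in character cells; wide/fullwidth characters count as 2."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in str(text)
    )


print(display_width("abc"), display_width("数据"))  # 3 4
```

With pandas this could replace `len` in the estimate, for example `my_dataframe[col].astype(str).map(display_width).max()`.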
Answer 3 (score: 5)
Dynamically adjust all the column lengths
writer = pd.ExcelWriter('/path/to/output/file.xlsx')
df.to_excel(writer, sheet_name='sheetName', index=False, na_rep='NaN')
for column in df:
    column_length = max(df[column].astype(str).map(len).max(), len(column))
    col_idx = df.columns.get_loc(column)
    writer.sheets['sheetName'].set_column(col_idx, col_idx, column_length)
Manually adjust a column using the column name
col_idx = df.columns.get_loc('columnName')
writer.sheets['sheetName'].set_column(col_idx, col_idx, 15)
Manually adjust a column using the column index
writer.sheets['sheetName'].set_column(col_idx, col_idx, 15)
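If any of the above fails with
AttributeError: 'Worksheet' object has no attribute 'set_column'
the worksheet probably comes from a different engine (such as openpyxl), so make sure xlsxwriter is installed: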
pip install xlsxwriter
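If installing xlsxwriter is not an option, the code can branch on what the worksheet object supports. The sketch below is an assumption-laden illustration: `set_width` and `index_to_letter` are hypothetical helper names, and any worksheet without `set_column` is assumed to be an openpyxl one exposing `column_dimensions`.

```python
def index_to_letter(col_idx):
    """Convert a 0-based column index to an Excel letter (0 -> 'A', 26 -> 'AA')."""
    letters = ""
    n = col_idx + 1
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def set_width(worksheet, col_idx, width):
    """Set one column's width on an XlsxWriter-style or openpyxl-style worksheet."""
    if hasattr(worksheet, "set_column"):
        # XlsxWriter worksheets take 0-based first/last column indices.
        worksheet.set_column(col_idx, col_idx, width)
    else:
        # Assumed openpyxl worksheet: widths are keyed by column letter.
        worksheet.column_dimensions[index_to_letter(col_idx)].width = width
```

Usage would look like `set_width(writer.sheets['sheetName'], col_idx, 15)`, whichever engine pandas picked.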
Answer 4 (score: 3)
You can do this with pandas and xlsxwriter; the code below works in Python 3.x. For more details on using XlsxWriter with pandas, this link may be useful: https://xlsxwriter.readthedocs.io/working_with_pandas.html
import pandas as pd
writer = pd.ExcelWriter(excel_file_path, engine='xlsxwriter')
df.to_excel(writer, sheet_name="Summary")
workbook = writer.book
worksheet = writer.sheets["Summary"]
#set the column width as per your requirement
worksheet.set_column('A:A', 25)
writer.close()  # writer.save() was removed in pandas 2.0; close() also saves
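Answer 5 (score: 2)
At work I am always writing dataframes to Excel files. So instead of writing the same code over and over, I created a module. Now I just import it and use it to write and format the Excel files. There is one downside, though: if the dataframe is very large, it takes a long time. So here is the code: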
import os
import pandas as pd

def result_to_excel(output_name, dataframes_list, sheet_names_list, output_dir):
    out_path = os.path.join(output_dir, output_name)
    writerReport = pd.ExcelWriter(out_path, engine='xlsxwriter',
                                  datetime_format='yyyymmdd', date_format='yyyymmdd')
    workbook = writerReport.book
    # loop through the list of dataframes to save every dataframe into a new sheet in the excel file
    for index, dataframe in enumerate(dataframes_list):
        sheet_name = sheet_names_list[index]  # choose the sheet name from sheet_names_list
        dataframe.to_excel(writerReport, sheet_name=sheet_name, index=False, startrow=0)
        # Add a header format.
        header_format = workbook.add_format({
            'bold': True,
            'border': 1,
            'fg_color': '#0000FF',
            'font_color': 'white'})
        # Write the column headers with the defined format.
        worksheet = writerReport.sheets[sheet_name]
        for col_num, col_name in enumerate(dataframe.columns.values):
            worksheet.write(0, col_num, col_name, header_format)
        worksheet.autofilter(0, 0, 0, len(dataframe.columns) - 1)
        worksheet.freeze_panes(1, 0)
        # loop through the columns in the dataframe to get the width of the column
        for col_idx, col in enumerate(dataframe.columns):
            max_width = max([len(str(s)) for s in dataframe[col].values] + [len(str(col)) + 2])
            # cap the width so that no column gets too wide
            max_width = min(max_width, 50)
            worksheet.set_column(col_idx, col_idx, max_width)
    writerReport.close()  # close() also saves; save() was removed in pandas 2.0
    return out_path  # os.path.join adds the separator that output_dir + output_name lacked
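One way to soften the slowness mentioned above is to measure only a sample of each column rather than every cell. The following is a rough sketch, not a drop-in part of the module: `estimate_width` and its `sample_size` parameter are hypothetical, and sampling only the first rows can underestimate the width when long values appear later in the data.

```python
from itertools import islice


def estimate_width(header, values, sample_size=1000, padding=2, max_width=50):
    """Estimate a column width from the header and the first `sample_size` values."""
    sampled = islice(values, sample_size)          # stop early on large columns
    longest = max((len(str(v)) for v in sampled), default=0)
    # Never narrower than the header, never wider than max_width.
    return min(max(longest, len(str(header))) + padding, max_width)


print(estimate_width("id", range(1_000_000), sample_size=100))  # 4
```

In the function above this could be called as `estimate_width(col, dataframe[col].values)` in place of the list comprehension.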
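Answer 6 (score: 1)
I found it more useful to size the columns based on the column headers rather than the column contents.
Use df.columns.values.tolist() to generate a list of the column headers, and use the lengths of those headers to determine the column widths.
See the full code below: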
import pandas as pd
import xlsxwriter
writer = pd.ExcelWriter(filename, engine='xlsxwriter')
df.to_excel(writer, index=False, sheet_name=sheetname)
workbook = writer.book # Access the workbook
worksheet= writer.sheets[sheetname] # Access the Worksheet
header_list = df.columns.values.tolist() # Generate list of headers
for i in range(0, len(header_list)):
    worksheet.set_column(i, i, len(str(header_list[i])))  # Set column widths based on len(header)
writer.close()  # Save the excel file (writer.save() was removed in pandas 2.0)
Answer 7 (score: 1)
def auto_width_columns(df, sheetname):
    # `writer` is expected to be an existing pd.ExcelWriter (xlsxwriter engine)
    # that df has already been written to
    workbook = writer.book
    worksheet = writer.sheets[sheetname]
    for i, col in enumerate(df.columns):
        column_len = max(df[col].astype(str).str.len().max(), len(col) + 2)
        worksheet.set_column(i, i, column_len)
Answer 8 (score: 1)
This function works for me, and it also fixes the index width
def write_to_excel(writer, X, sheet_name, sep_only=False):
    #writer=writer object
    #X=dataframe
    #sheet_name=name of sheet
    #sep_only=True:write only as separate excel file, False: write as sheet to the writer object
    #output_folder and prefix_excel_save are expected to be defined elsewhere
    if sheet_name=="":
        print("specify sheet_name!")
    else:
        X.to_excel(f"{output_folder}{prefix_excel_save}_{sheet_name}.xlsx")
        if not sep_only:
            X.to_excel(writer, sheet_name=sheet_name)
            #fix column widths
            worksheet = writer.sheets[sheet_name]  # pull worksheet object
            for idx, col in enumerate(X.columns):  # loop through all columns
                series = X[col]
                max_len = max((
                    series.astype(str).map(len).max(),  # len of largest item
                    len(str(series.name))  # len of column name/header
                    )) + 1  # adding a little extra space
                worksheet.set_column(idx+1, idx+1, max_len)  # set column width (+1 because the index occupies column 0)
            #fix index width
            max_len=pd.Series(X.index.values).astype(str).map(len).max()+1
            worksheet.set_column(0, 0, max_len)
        if sep_only:
            print(f'{sheet_name} is written as separate file')
        else:
            print(f'{sheet_name} is written as separate file')
            print(f'{sheet_name} is written as sheet')
    return writer
Example call:
writer = write_to_excel(writer, dataframe, "Statistical_Analysis")
Answer 9 (score: 0)
The simplest solution is to specify the column width in the set_column method.
for worksheet in writer.sheets.values():
    worksheet.set_column(0, last_column_value, required_width_constant)
Answer 10 (score: 0)
Combining the other answers and comments, and also supporting multi-indexes:
def autosize_excel_columns(worksheet, df):
    autosize_excel_columns_df(worksheet, df.index.to_frame())
    autosize_excel_columns_df(worksheet, df, offset=df.index.nlevels)

def autosize_excel_columns_df(worksheet, df, offset=0):
    for idx, col in enumerate(df):
        series = df[col]
        max_len = max((
            series.astype(str).map(len).max(),
            len(str(series.name))
            )) + 1
        worksheet.set_column(idx+offset, idx+offset, max_len)

sheetname=...
df.to_excel(writer, sheet_name=sheetname, freeze_panes=(df.columns.nlevels, df.index.nlevels))
worksheet = writer.sheets[sheetname]
autosize_excel_columns(worksheet, df)
writer.close()  # writer.save() was removed in pandas 2.0; close() also saves
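A small caveat when the columns themselves are a MultiIndex: `series.name` is then a tuple, and `str()` of a tuple includes the parentheses, quotes and commas, which tends to overestimate the header width. pandas writes each column level into its own header row, so the longest single level is probably the better measure. A sketch of that idea (the helper name `header_width` is hypothetical):

```python
def header_width(name):
    """Width needed for a column label; for tuple labels, the longest level."""
    if isinstance(name, tuple):
        return max((len(str(level)) for level in name), default=0)
    return len(str(name))


print(header_width(("sales", "2023 Q1")), len(str(("sales", "2023 Q1"))))  # 7 20
```

It could stand in for `len(str(series.name))` in `autosize_excel_columns_df`.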
Answer 11 (score: 0)
import openpyxl
from openpyxl.utils import get_column_letter
..
for col in _ws.columns:  # _ws is an openpyxl worksheet
    max_length = 0
    # column letter of this column, e.g. 'A' or 'AB' (a regex on the cell repr breaks past column Z)
    col_name = get_column_letter(col[0].column)
    for cell in col:
        if cell.value is not None:
            # measure the text form, so numbers and dates work too
            max_length = max(max_length, len(str(cell.value)))
    adjusted_width = (max_length+2)
    _ws.column_dimensions[col_name].width = adjusted_width
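Answer 12 (score: 0)
Yes, there is something you can do to the xlsx file after the fact to adjust the column widths: use xlwings to autofit the columns. It is a pretty simple solution; see the last six lines of the example code. The advantage of this approach is that you don't have to worry about font size, font type, or anything else. Requirement: an Excel installation.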
import pandas as pd
import xlwings as xw
report_file = "test.xlsx"
df1 = pd.DataFrame([
    ('this is a long term1', 1, 1, 3),
    ('this is a long term2', 1, 2, 5),
    ('this is a long term3', 1, 1, 6),
    ('this is a long term2', 1, 1, 9),
    ], columns=['term', 'aaaa', 'bbbbbbb', "cccccccccccccccccccccccccccccccccccccccccccccc"])
writer = pd.ExcelWriter(report_file, engine="xlsxwriter")
df1.to_excel(writer, sheet_name="Sheet1", index=False)
workbook = writer.book
worksheet1 = writer.sheets["Sheet1"]
num_format = workbook.add_format({"num_format": '#,##0.00'})
worksheet1.set_column("B:D", cell_format=num_format)
writer.close()  # writer.save() was removed in pandas 2.0; close() also saves
# Autofit all columns with xlwings.
app = xw.App(visible=False)
wb = xw.Book(report_file)
for ws in wb.sheets:
    ws.autofit(axis="columns")
wb.save(report_file)
app.quit()