Question

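I wrote a program that uses pandas to remove duplicate rows from an Excel file. After that step succeeds, I export the new data from pandas to Excel, but the new Excel file appears to be missing data (specifically in the columns that contain dates). Instead of the actual data, the rows show only "###########".

Code: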

import pandas as pd
data = pd.read_excel('test.xlsx')
data.sort_values("Serial_Nbr", inplace=True)
data.drop_duplicates(subset="Serial_Nbr", keep="first", inplace=True)
data.to_excel(r'test_updated.xlsx')

Before and after export:

date (before)               date (after)

2018-07-01                  ##########
2018-08-01                  ##########
2018-08-01                  ##########

Answer 1

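This means the cell is too narrow to display the data. Try widening the cell.

(Screenshot: the cell is too narrow.)

(Screenshot: after widening the cell.)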

To control how dates and datetimes are displayed in the exported file, pass format codes to the Excel writer:

import pandas as pd

data = pd.read_excel('Book1.xlsx')
data = data.sort_values("date")  # sort_values returns a new frame unless inplace=True
data.drop_duplicates(subset="date", keep="first", inplace=True)

# Writer with explicit display formats for datetime and date values
writer = pd.ExcelWriter("test_updated.xlsx",
                        datetime_format='mm dd yyyy',
                        date_format='mmm dd yyyy')

# Write the DataFrame to the workbook through the writer
data.to_excel(writer, sheet_name='Sheet1')
writer.close()  # writer.save() was removed in pandas 2.0; close() saves the file
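
If it helps, here is a minimal sketch of the same export written with a `with` block, which saves and closes the file on exit. The function name `export_with_date_format`, the sample data, and the `yyyy-mm-dd` format are my own assumptions, not part of the answer above, and whether the format ends up on the cells can depend on the Excel engine pandas picks. Note that a display format on its own does not widen the column.

```python
import os
import tempfile

import pandas as pd


def export_with_date_format(df, path):
    """Write df to an .xlsx file, asking for an explicit date display format."""
    # The "with" block saves and closes the workbook on exit, so no
    # explicit writer.save() / writer.close() call is needed.
    with pd.ExcelWriter(
        path,
        date_format="yyyy-mm-dd",      # applies to datetime.date values
        datetime_format="yyyy-mm-dd",  # applies to datetime / Timestamp values
    ) as writer:
        df.to_excel(writer, sheet_name="Sheet1", index=False)


# Hypothetical sample data shaped like the question's file.
sample = pd.DataFrame({
    "Serial_Nbr": [101, 102, 103],
    "date": pd.to_datetime(["2018-07-01", "2018-08-01", "2018-08-01"]),
})

with tempfile.TemporaryDirectory() as tmp:
    out_path = os.path.join(tmp, "test_updated.xlsx")
    export_with_date_format(sample, out_path)
    row_count = len(pd.read_excel(out_path, sheet_name="Sheet1"))

print(row_count)
```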

Answer 2

When a cell is too narrow to display its contents, it shows ##########. You need to widen the cell or shorten its contents.
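
To check that nothing was actually lost, one option is to read the exported file back into pandas. This is a minimal sketch, assuming a small made-up DataFrame with the question's column names and an installed Excel engine such as openpyxl:

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data mirroring the question's columns.
df = pd.DataFrame({
    "Serial_Nbr": [101, 102, 103],
    "date": pd.to_datetime(["2018-07-01", "2018-08-01", "2018-08-01"]),
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test_updated.xlsx")
    df.to_excel(path, index=False)  # Excel may render the date column as #####
    restored = pd.read_excel(path)  # the stored values are still there

print(restored["date"].tolist())
```

If the dates come back unchanged, the "#####" is only a display issue in Excel, not missing data.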

Answer 3

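Regarding the original question, I agree with ALFAFA's answer.
Here I am trying to resize the column so that the end user does not have to do it by hand in the xls file.

The steps are:

Get the column name (in xls, column names start with "A", "B", "C", and so on)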

colPosn = data.columns.get_loc('col#3')   # Zero-based position of the column in the DataFrame
xlsColName = chr(ord('A') + colPosn)      # xls column letter (not the DataFrame header), used to set xls column attributes; valid for the first 26 columns when written with index=False

Get the new width for column 'col#3' from the length of the longest string in the column

maxColWidth = 1 + data['col#3'].astype(str).map(len).max()  # Length of the longest value in 'col#3' (+1 as buffer space so the data is visible); astype(str) keeps len() from failing on non-string values

Use the column_dimensions[colName].width property to increase the width of the xls column

data.to_excel(writer, sheet_name='Sheet1', index=False)  # index=False leaves the extra index column out of the file
sheet = writer.book['Sheet1']  # openpyxl worksheet; column_dimensions needs a writer created with engine='openpyxl'
sheet.column_dimensions[xlsColName].width = maxColWidth  # Widen the column to fit its longest string
writer.close()  # writer.save() was removed in pandas 2.0; close() saves the file

Replace the last two lines of ALFAFA's post with the blocks above (all of the parts above) to adjust the column width for 'col#3'.
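
The same idea can be extended to every column at once. The sketch below is only one way to do it, assuming the openpyxl engine and a non-empty DataFrame; `export_autofit`, `padding`, and the sample data are hypothetical names of mine. It also assumes datetimes are displayed as `yyyy-mm-dd hh:mm:ss`, which is why date columns are measured at that length instead of the shorter text pandas prints.

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook
from openpyxl.utils import get_column_letter


def export_autofit(df, path, sheet_name="Sheet1", padding=2):
    """Write df to path and widen every column to fit its longest value."""
    with pd.ExcelWriter(path, engine="openpyxl") as writer:
        df.to_excel(writer, sheet_name=sheet_name, index=False)
        sheet = writer.sheets[sheet_name]
        for position, column in enumerate(df.columns, start=1):
            values = df[column]
            if pd.api.types.is_datetime64_any_dtype(values):
                # Assumption: Excel shows datetimes as "yyyy-mm-dd hh:mm:ss".
                text = values.dt.strftime("%Y-%m-%d %H:%M:%S")
            else:
                text = values.astype(str)
            longest = max(int(text.str.len().max()), len(str(column)))
            # get_column_letter also handles columns past "Z" (27 -> "AA").
            letter = get_column_letter(position)
            sheet.column_dimensions[letter].width = longest + padding


# Hypothetical sample data shaped like the question's file.
sample = pd.DataFrame({
    "Serial_Nbr": [101, 102, 103],
    "date": pd.to_datetime(["2018-07-01", "2018-08-01", "2018-08-01"]),
})

with tempfile.TemporaryDirectory() as tmp:
    out_path = os.path.join(tmp, "test_updated.xlsx")
    export_autofit(sample, out_path)
    saved_sheet = load_workbook(out_path)["Sheet1"]
    widths = {
        letter: dim.width
        for letter, dim in saved_sheet.column_dimensions.items()
    }

print(widths)
```

Character counts are only an approximation of Excel's column width units, so the `padding` value may need tuning for wide fonts.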

Data lost when exporting a DataFrame from pandas to Excel
