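I wrote a program that uses pandas to remove duplicate rows from an Excel file. That part works, but when I export the new data from pandas back to Excel, the new file appears to be missing data, specifically in the columns that contain dates. Instead of the actual values, the rows only show "###########".
Code: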
import pandas as pd
data = pd.read_excel('test.xlsx')
data.sort_values("Serial_Nbr", inplace = True)
data.drop_duplicates(subset ="Serial_Nbr", keep = "first", inplace = True)
data.to_excel (r'test_updated.xlsx')
Before and after export:
date date
2018-07-01 ##########
2018-08-01 ##########
2018-08-01 ##########
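Answer 0: (score: 2)
This means the cell is too narrow to display the data. Try widening the cell.
Cell width too narrow:
After widening the cell:
To export datetimes to Excel correctly, you must also pass format codes for the Excel export: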
import pandas as pd
data = pd.read_excel('Book1.xlsx')
data.sort_values("date", inplace=True)  # sort in place; with inplace=False the sorted result was discarded
data.drop_duplicates(subset="date", keep="first", inplace=True)
# Writer with explicit datetime and date formats
writer = pd.ExcelWriter("test_updated.xlsx",
                        datetime_format='mm dd yyyy',
                        date_format='mmm dd yyyy')
# Write the dataframe to the Excel file through the writer.
data.to_excel(writer, sheet_name='Sheet1')
writer.close()  # writer.save() was removed in pandas 2.0; close() also saves the file
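Answer 1: (score: 0)
"##########" is what Excel shows when a cell is too narrow for its content. You need to increase the width of the cell or reduce its content.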
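To make the "too narrow" condition concrete, here is a rough sketch in plain Python. It assumes that one unit of Excel column width corresponds to about one character of the default font, and that an untouched column is about 8.43 units wide. `DEFAULT_WIDTH` and `needs_wider_column` are hypothetical names used only for illustration; they are not part of pandas or Excel.

```python
from datetime import date, datetime

# Assumption: an untouched Excel column is roughly 8.43 character units wide.
DEFAULT_WIDTH = 8.43


def needs_wider_column(text, width=DEFAULT_WIDTH):
    """Return True if the rendered text has more characters than the column width."""
    return len(text) > width


full = str(datetime(2018, 7, 1))       # date plus time of day
short = date(2018, 7, 1).isoformat()   # date only

print(full, len(full), needs_wider_column(full))
print(short, len(short), needs_wider_column(short))
print(needs_wider_column(short, width=12))
```

Under these assumptions even the date-only text is longer than the default column, which matches the advice to widen the column rather than only shortening the content.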
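Answer 2: (score: 0)
Regarding the original question, I agree with ALFAFA's answer.
Here I am trying to resize the column so that the end user does not have to do it manually in the xls file.
The steps are as follows: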
colPosn = data.columns.get_loc('col#3')  # Get column position
xlsColName = chr(ord('A') + colPosn)  # Get xls column name (not the data frame's column header); used to set attributes of xls columns. Valid for columns A-Z only
maxColWidth = 1 + data['col#3'].astype(str).map(len).max()  # Length of the longest value in column 'col#3' (+1 for some buffer so the data is visible); astype(str) lets this work for dates too
data.to_excel(writer, sheet_name='Sheet1', index=False)  # index=False leaves out the extra index column
sheet = writer.book['Sheet1']  # assumes writer was created with engine='openpyxl'
sheet.column_dimensions[xlsColName].width = maxColWidth  # Widen the column to fit its longest value
writer.close()  # writer.save() was removed in pandas 2.0; close() also saves the file
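Note that `chr(ord('A') + colPosn)` only produces a valid column name for the first 26 columns (A to Z). A minimal sketch of a more general conversion follows; `excel_column_letter` is a hypothetical helper name, not something provided by pandas.

```python
def excel_column_letter(index):
    """Convert a zero-based column position to an Excel column name (A, B, ..., Z, AA, ...)."""
    if index < 0:
        raise ValueError("index must be non-negative")
    letters = ""
    n = index + 1  # Excel column names behave like a 1-based, bijective base-26 number
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


print(excel_column_letter(0))    # A
print(excel_column_letter(25))   # Z
print(excel_column_letter(26))   # AA
print(excel_column_letter(701))  # ZZ
```

If openpyxl is already being used for the export, its `openpyxl.utils.get_column_letter` function (which takes a 1-based index) should serve the same purpose.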