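I have a simple script that reads an Excel .xlsx file into a DataFrame and then writes it out as a pickle file with to_pickle. I have been using the same code for months as new Excel files arrive. This time, however, it fails with TypeError: cannot serialize '_io.BufferedReader' object, and I can't tell why. Here is the code: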
import pandas as pd

# Path to .xlsx (MonthlyFolder is defined earlier in the script)
MasterItem = MonthlyFolder + "MasterItem__Nov2019.xlsx"

# Function to read the excel file
def ReadExcel(filename, sheetname=None, header=0):
    from openpyxl import load_workbook
    wb = load_workbook(filename, read_only=True)
    if sheetname is None:  # If sheetname is not provided then grab the first sheet
        print("\t Reading " + wb.sheetnames[0])
        ws = wb[wb.sheetnames[0]]
    else:
        print("\t Reading " + sheetname)
        ws = wb[sheetname]
    data = ws.values
    if header is None:
        columns = None
    elif header > 0:
        # Skip non header rows
        for i in range(0, header):
            next(data)
        # Save header row
        columns = next(data)[0:]
    else:
        columns = next(data)[0:]
    # Create a DataFrame based on the subsequent lines of data
    df_Out = pd.DataFrame(data, columns=columns)
    return df_Out

# Reading .xlsx and writing as pickle
RawMasterItem = ReadExcel(MasterItem)
pd.to_pickle(RawMasterItem, MonthlyFolder+"RawMasterItem.pkl")  # This fails to run
Here is the output and the error I get:
../Data/2019Nov/MasterItem__Nov2019.xlsx
Reading Sheet1
Traceback (most recent call last):
  File "C:\Users\Eulhaq\AppData\Local\conda\conda\envs\DataScience\lib\site-packages\IPython\core\interactiveshell.py", line 3326, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-10-07041bb51f98>", line 3, in <module>
    pd.to_pickle(RawMasterItem, MonthlyFolder+"RawMasterItem.pkl")
  File "C:\Users\Eulhaq\AppData\Local\conda\conda\envs\DataScience\lib\site-packages\pandas\io\pickle.py", line 76, in to_pickle
    f.write(pickle.dumps(obj, protocol=protocol))
TypeError: cannot serialize '_io.BufferedReader' object
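For reference, this kind of message can be reproduced without pandas or openpyxl, which suggests that something inside the DataFrame is holding on to an open file handle. The sketch below is only an illustration: `CellLike` and `pickle_error` are hypothetical names made up for this example, not openpyxl or pandas APIs, and the exact wording of the message differs between Python versions.

```python
import io
import pickle


class CellLike:
    """Hypothetical stand-in for an object that keeps an open file handle."""

    def __init__(self, handle):
        self.handle = handle


def pickle_error(obj):
    """Return the message raised by pickle.dumps(obj), or None if it succeeds."""
    try:
        pickle.dumps(obj)
    except (TypeError, pickle.PicklingError) as exc:
        return str(exc)
    return None


# An in-memory buffered reader plays the role of the open workbook file.
handle = io.BufferedReader(io.BytesIO(b"dummy bytes"))

print(pickle_error("plain text"))      # plain values pickle without error
print(pickle_error(handle))            # message names the BufferedReader type
print(pickle_error(CellLike(handle)))  # an object wrapping the handle fails too
```

If this is what is happening, the DataFrame itself is fine and only the individual values that wrap a handle are the problem.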
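After some debugging, I realized that openpyxl is returning some blank cells as <ReadOnlyCell 'Sheet1'.D2> rather than as an empty value. I'm not sure why this happens. I have checked the Excel file and there are no hidden characters at those positions. Any idea why openpyxl doesn't read these cells as blanks like the others?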
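One way to narrow this down is to inspect the raw rows before they go into the DataFrame. This is a minimal sketch, assuming the rows arrive as tuples the way `ws.values` yields them. `is_picklable`, `find_unpicklable` and `blank_unpicklable` are hypothetical helper names, and replacing an offending value with `None` is my assumption about what a blank cell should become, not something openpyxl does on its own.

```python
import io
import pickle


def is_picklable(value):
    """Return True if pickle.dumps accepts the value."""
    try:
        pickle.dumps(value)
    except Exception:
        return False
    return True


def find_unpicklable(rows):
    """Return (row_index, column_index, type_name) for every value pickle rejects."""
    return [
        (r, c, type(value).__name__)
        for r, row in enumerate(rows)
        for c, value in enumerate(row)
        if not is_picklable(value)
    ]


def blank_unpicklable(rows):
    """Yield each row as a tuple with unpicklable values replaced by None."""
    for row in rows:
        yield tuple(value if is_picklable(value) else None for value in row)


# Stand-in data: the BufferedReader plays the role of the odd cell object.
odd = io.BufferedReader(io.BytesIO(b""))
rows = [("A1", 10, None), ("A2", odd, 3.5)]

print(find_unpicklable(rows))
print(list(blank_unpicklable(rows)))
```

In `ReadExcel`, the cleaned generator could presumably be passed on as `pd.DataFrame(blank_unpicklable(data), columns=columns)`. Pickling every value is slow on large sheets, so this is better suited to diagnosis than to routine use, and it is worth checking first whether the flagged cells point to a real problem in the workbook.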