Question

I am trying to open an Excel file that was given to me for my project; it is a file we get from our SAP system. However, when I try to open it with pandas, I get the following error:

XLRDError: Unsupported format, or corrupt file: Expected BOF record; found '\xff\xfe\r\x00\n\x00\r\x00'

Here is my code:

pandas

Answer 1

This worked for me once. I don't know whether it will work for you, but you can try the following anyway:

from __future__ import unicode_literals
from xlwt import Workbook
import io

filename = r'myexcel.xls'
# Opening the file using 'utf-16' encoding
with io.open(filename, "r", encoding="utf-16") as file1:
    data = file1.readlines()

# Creating a workbook object
xldoc = Workbook()
# Adding a sheet to the workbook object
sheet = xldoc.add_sheet("Sheet1", cell_overwrite_ok=True)
# Iterating and saving the data to sheet
for i, row in enumerate(data):
    # Two things are done here
    # Removing the '\n' left at the end of each line read via io.open
    # Getting the values after splitting using '\t'
    for j, val in enumerate(row.replace('\n', '').split('\t')):
        sheet.write(i, j, val)

# Saving the file as an excel file
# Note: this reuses the input file name, so the original file is overwritten
xldoc.save('myexcel.xls')
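
If only the cell values are needed, the same read step can be sketched with the standard library alone. This assumes, as the code above does, that the file is really UTF-16 text with tab-separated columns; the helper name read_utf16_tsv is made up for illustration.

```python
import csv
import io


def read_utf16_tsv(path):
    """Return the rows of a UTF-16, tab-separated text file as lists of strings."""
    # newline="" leaves line-ending handling to the csv module.
    # QUOTE_NONE mirrors the plain split('\t') used above: quotes stay literal.
    with io.open(path, "r", encoding="utf-16", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return [row for row in reader]
```

Blank lines in the file come back as empty lists. With pandas installed, pd.read_csv(path, sep='\t', encoding='utf-16') should give a similar result as a DataFrame.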

Answer 2

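I ran into the same xlrd.biffh.XLRDError: Unsupported format, or corrupt file: Expected BOF record; error and solved it by writing an XML-to-XLSX converter. After the conversion you can call pd.ExcelFile('myexcel.xlsx'). The reason is that pandas uses xlrd to read Excel files, and xlrd does not support XML Spreadsheet (*.xml) files, which are neither XLS nor XLSX.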

import pandas as pd
from bs4 import BeautifulSoup

def convert_to_xlsx():
    with open('sample.xls') as xml_file:
        soup = BeautifulSoup(xml_file.read(), 'xml')
        writer = pd.ExcelWriter('sample.xlsx')
        for sheet in soup.findAll('Worksheet'):
            sheet_as_list = []
            for row in sheet.findAll('Row'):
                sheet_as_list.append([cell.Data.text if cell.Data else '' for cell in row.findAll('Cell')])
            pd.DataFrame(sheet_as_list).to_excel(writer, sheet_name=sheet.attrs['ss:Name'], index=False, header=False)

        # close() writes the file; writer.save() was removed in pandas 2.0
        writer.close()
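
For comparison, here is a sketch of the same row extraction using only the standard library instead of BeautifulSoup. It assumes the file uses the XML Spreadsheet 2003 namespace shown in the code, and the function name read_xml_spreadsheet is hypothetical. Like the converter above, it ignores ss:Index attributes, so skipped cells are not padded.

```python
import xml.etree.ElementTree as ET

# Namespace used by XML Spreadsheet 2003 documents (assumed for this sketch)
SS = "urn:schemas-microsoft-com:office:spreadsheet"
NS = {"ss": SS}


def read_xml_spreadsheet(xml_text):
    """Return {sheet name: list of rows} parsed from XML Spreadsheet text."""
    root = ET.fromstring(xml_text)
    sheets = {}
    for sheet in root.iterfind("ss:Worksheet", NS):
        name = sheet.get("{%s}Name" % SS)
        rows = []
        for row in sheet.iterfind("ss:Table/ss:Row", NS):
            # A cell without a Data child becomes an empty string
            rows.append([
                cell.findtext("ss:Data", default="", namespaces=NS)
                for cell in row.iterfind("ss:Cell", NS)
            ])
        sheets[name] = rows
    return sheets
```

Each list of rows can then be passed to pd.DataFrame(rows) and written out with to_excel, as in the converter above.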

Answer 3

What worked for me was applying this suggestion:

How to cope with an XLRDError

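There you can also find the explanation that fit my case. It says the problem is that the file was not saved in the proper format. When I opened the xls file, Excel offered to save it as html. I saved it as ".xlsx" instead, and that solved the problem.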
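
Since all three answers come down to the file not really being a binary XLS file, a quick look at its first bytes can help decide which fix applies. The following is only a rough sketch: the function name and the returned labels are made up for illustration, and the check covers just a few common signatures.

```python
def sniff_spreadsheet_format(path):
    """Guess what a file really contains from its first bytes."""
    with open(path, "rb") as handle:
        head = handle.read(16)
    if head.startswith(b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"):
        return "xls"          # OLE2 container, the binary format xlrd expects
    if head.startswith(b"PK\x03\x04"):
        return "xlsx"         # zip archive, as used by .xlsx
    if head.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "utf-16 text"  # byte order mark, as in the error in the question
    if head.startswith(b"\xef\xbb\xbf"):
        head = head[3:]       # skip a UTF-8 byte order mark
    if head.lstrip().startswith(b"<"):
        return "xml or html"  # markup saved with an .xls extension
    return "unknown"
```

A result of "utf-16 text" points towards the approach in Answer 1, and "xml or html" towards Answer 2 or re-saving the file as described here.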

Python - XLRDError: Unsupported format, or corrupt file: Expected BOF record
