I am trying to open an Excel file that was given to me for my project; the file is an export from our SAP system. However, when I try to open it, I get the following error:
XLRDError: Unsupported format, or corrupt file: Expected BOF record; found '\xff\xfe\r\x00\n\x00\r\x00'
I am reading the file with pandas.
Answer 0 (score: 1)
This worked for me once. I don't know whether it will work for you, but you can try the following approach:
from __future__ import unicode_literals
from xlwt import Workbook
import io

filename = r'myexcel.xls'

# Open the file as text using the 'utf-16' encoding; the with block closes
# the handle before the same path is overwritten below
with io.open(filename, "r", encoding="utf-16") as file1:
    data = file1.readlines()

# Create a workbook object
xldoc = Workbook()
# Add a sheet to the workbook object
sheet = xldoc.add_sheet("Sheet1", cell_overwrite_ok=True)
# Iterate over the lines and write the data to the sheet
for i, row in enumerate(data):
    # Two things are done here:
    # removing the '\n' left over from reading the file with io.open,
    # and splitting the line into values on '\t'
    for j, val in enumerate(row.replace('\n', '').split('\t')):
        sheet.write(i, j, val)

# Save the result as a real Excel file (this overwrites the original file)
xldoc.save('myexcel.xls')
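The answer above treats the SAP export as UTF-16, tab-separated text rather than as a binary workbook. If that assumption holds for your file, the rows can also be read directly without writing a new workbook. The following is a minimal sketch using only the standard library; `read_utf16_tsv` is a hypothetical helper name, and the blank-line skipping is an assumption based on the error message, which suggests the file starts with empty lines.

```python
import csv
import io


def read_utf16_tsv(path):
    """Return the rows of a UTF-16, tab-separated text file as lists of strings.

    Hypothetical helper: assumes the file is plain text with a UTF-16 byte
    order mark and tab-delimited columns, as in the answer above.
    """
    # newline="" lets the csv module handle the line endings itself.
    with io.open(path, "r", encoding="utf-16", newline="") as handle:
        # QUOTE_NONE mirrors the plain split('\t') used in the answer above.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        # csv.reader yields an empty list for a blank line; skip those.
        return [row for row in reader if row]
```

The returned list of rows can then be passed to whatever you use downstream, for example as the data argument of a pandas DataFrame.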
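Answer 1 (score: 0)
I ran into the same xlrd.biffh.XLRDError: Unsupported format, or corrupt file: Expected BOF record; error and solved it by writing an XML to XLSX converter. After the conversion you can call pd.ExcelFile('myexcel.xlsx'). The reason is that pandas uses xlrd to read Excel files, and xlrd does not support XML Spreadsheet (*.xml), which is neither XLS nor XLSX.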
import pandas as pd
from bs4 import BeautifulSoup

def convert_to_xlsx():
    # The '.xls' file is really an XML Spreadsheet, so parse it as XML
    with open('sample.xls') as xml_file:
        soup = BeautifulSoup(xml_file.read(), 'xml')
        writer = pd.ExcelWriter('sample.xlsx')
        # Write each <Worksheet> element to its own sheet
        for sheet in soup.findAll('Worksheet'):
            sheet_as_list = []
            for row in sheet.findAll('Row'):
                # Cells without a <Data> child become empty strings
                sheet_as_list.append([cell.Data.text if cell.Data else '' for cell in row.findAll('Cell')])
            pd.DataFrame(sheet_as_list).to_excel(writer, sheet_name=sheet.attrs['ss:Name'], index=False, header=False)
        # close() writes the file; ExcelWriter.save() was removed in pandas 2.0
        writer.close()
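Answer 2 (score: 0)
What worked for me was applying this suggestion, which also comes with an explanation that made sense to me. It says the problem is that the file was not saved in the correct format. When I opened the xls file, Excel offered to save it as html. I saved it as ".xlsx" instead, and that solved the problem.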
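All three answers come down to the same point: the ".xls" extension does not match what the file actually contains. One way to check this before choosing a fix is to look at the first bytes of the file. The sketch below is only an illustration; `sniff_spreadsheet_format` and its return labels are hypothetical names, and the check is a heuristic based on well-known file signatures, not something xlrd or pandas provides.

```python
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # container of binary .xls files
ZIP_MAGIC = b"PK\x03\x04"                          # .xlsx files are ZIP archives
UTF16_BOMS = (b"\xff\xfe", b"\xfe\xff")            # UTF-16 LE / BE byte order marks
UTF8_BOM = b"\xef\xbb\xbf"


def sniff_spreadsheet_format(path):
    """Guess what a file really contains from its leading bytes (heuristic)."""
    with open(path, "rb") as handle:
        head = handle.read(64)
    if head.startswith(OLE2_MAGIC):
        return "xls"
    if head.startswith(ZIP_MAGIC):
        return "zip (possibly xlsx)"
    if head.startswith(UTF16_BOMS):
        # Matches the '\xff\xfe...' bytes shown in the error in the question.
        return "utf-16 text"
    # Ignore an optional UTF-8 byte order mark before looking for markup.
    text_start = head[len(UTF8_BOM):] if head.startswith(UTF8_BOM) else head
    if text_start.lstrip().startswith(b"<"):
        return "xml or html"
    return "unknown"
```

A result of "utf-16 text" points toward the approach in Answer 0, while "xml or html" points toward Answer 1 or toward re-saving the file from Excel as in Answer 2.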