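I downloaded a large number of .xlsx files from an external database and want to work with them. Each file has two worksheets: the first contains only some comments about the data, and the second contains the data itself.
I tried opening the Excel spreadsheets with the following two options, but both give me an error. The error disappears when I delete the first worksheet. But since I have more than 350 files, I don't want to delete all of these worksheets by hand.
The code I tried: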
from openpyxl import load_workbook
wb = load_workbook('/Users/user/Desktop/Data_14.xlsx')
which gives the error:
InvalidFileException: "There is no item named 'xl/styles.xml' in the archive"
and
from xlrd import open_workbook
book = open_workbook('/Users/user/Desktop/Data_14.xlsx')
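which gives a very long error message (KeyError: 9).
I think the problem is a formula error in the first Excel worksheet. One cell in that worksheet reads
- minimum percentage that must characterise the path from a subject Company up to its Ultimate owner: 50.01%
but it is not formatted as text. Executing the cell shows an error message in Excel. Inserting a leading ' to turn it into text lets me open the file with Python afterwards, which is what I want to do.
Any ideas on how to open the Excel files automatically so that this problem is avoided?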
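Because an .xlsx file is a ZIP archive, the first error can be checked across the whole batch without openpyxl or xlrd by testing whether each archive contains 'xl/styles.xml'. The following is a minimal sketch using only the standard library; the function name files_missing_member and its folder argument are hypothetical, and it only reports which files are affected rather than repairing them.

```python
import zipfile
from pathlib import Path


def files_missing_member(folder, member="xl/styles.xml"):
    """Return the names of .xlsx files in `folder` whose archive lacks `member`.

    Hypothetical helper for illustration: files that are not valid ZIP
    archives at all are reported as well, since they cannot contain `member`.
    """
    missing = []
    for path in sorted(Path(folder).glob("*.xlsx")):
        try:
            with zipfile.ZipFile(path) as archive:
                if member not in archive.namelist():
                    missing.append(path.name)
        except zipfile.BadZipFile:
            # Not a ZIP archive, so the member is necessarily absent
            missing.append(path.name)
    return missing
```

Calling files_missing_member('/Users/user/Desktop') would list the workbooks in that folder that are expected to trigger the InvalidFileException shown above.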
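Answer 0 (score: 1)
Solution:
I named the script delsheet.py and placed it in the directory that also contains the Excel files.
I am on Mac OS X Yosemite.
It would be useful to know your versions and setup, because openpyxl can behave differently from one version to the next.
It is convenient if the files all have the same layout. If the first sheet in every file is named 'Sheet1', this script will run as is. That is how you phrased the question, so that is how I wrote the solution; if your files differ, please clarify. Thanks.
Understanding the script:
First, the script stores the path of its own location, so it knows which directory it was called from and can find the files there.
From that location, the script lists the files in the same directory that have the .xlsx extension and appends them to the list 'spreadsheet_list'.
A for loop over 'spreadsheet_list' then processes each workbook, so the number of elements in the list determines how many times the loop runs.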
delsheet.py
#!/usr/bin/env python3
# Using python 3.4.3 and openpyxl 2.3.0
# Remove the first worksheet from a batch of Excel workbooks
# Import modules
import os
from openpyxl import load_workbook
# Create list
spreadsheet_list = []
# Path to the directory in which this script is located
pcwd = os.path.dirname(os.path.abspath(__file__))
# List the files in that directory
items = os.listdir(pcwd)
# Keep the files with the .xlsx extension and append their full paths to "spreadsheet_list",
# so the script also works when it is started from a different working directory
for name in items:
    if name.endswith(".xlsx"):
        spreadsheet_list.append(os.path.join(pcwd, name))
# Debugging purposes: print out the list of Excel files found in the script directory
# print(spreadsheet_list)
# Loop over the list; it runs once per Excel file found
for path in spreadsheet_list:
    # Load workbook into memory (opening the Excel file automatically...)
    wb = load_workbook(path)
    # Store Sheet1 in the workbook as 'ws'
    ws = wb['Sheet1']
    # Remove the worksheet 'ws'
    # (remove_sheet is the openpyxl 2.3.0 name; later releases deprecate it in favour of wb.remove(ws))
    wb.remove_sheet(ws)
    # Save the edited workbook under its original name
    wb.save(path)
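The file-listing step described above can also be written with pathlib. This is only a sketch under stated assumptions: the helper name list_workbooks is hypothetical, the extension match is made case-insensitive, and names starting with "~$" are skipped because Excel creates such lock files next to a workbook that is currently open.

```python
from pathlib import Path


def list_workbooks(folder):
    """Return sorted full paths of the .xlsx files in `folder`.

    Hypothetical helper for illustration: skips "~$" lock files that Excel
    creates while a workbook is open, and ignores subdirectories.
    """
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file()
        and p.suffix.lower() == ".xlsx"
        and not p.name.startswith("~$")
    )
```

Each returned path could then be passed to load_workbook and wb.save in place of the bare file names, for example by looping over list_workbooks(Path(__file__).resolve().parent).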
Answer 1 (score: 0)
Try using this add-in to merge all of the second sheets.
http://www.rondebruin.nl/win/addins/rdbmerge.htm
Or, run this script to delete the first worksheet in every workbook.
Sub Example()
    Dim MyPath As String, FilesInPath As String
    Dim MyFiles() As String, Fnum As Long
    Dim mybook As Workbook
    Dim CalcMode As Long
    Dim sh As Worksheet
    Dim ErrorYes As Boolean
    'Suppress the confirmation prompt shown when a sheet is deleted
    Application.DisplayAlerts = False
    'Fill in the path\folder where the files are
    MyPath = "C:\Users\rshuell001\Desktop\excel\"
    'Add a slash at the end if the user forgot it
    If Right(MyPath, 1) <> "\" Then
        MyPath = MyPath & "\"
    End If
    'If there are no Excel files in the folder exit the sub
    FilesInPath = Dir(MyPath & "*.xl*")
    If FilesInPath = "" Then
        MsgBox "No files found"
        Exit Sub
    End If
    'Fill the array (MyFiles) with the list of Excel files in the folder
    Fnum = 0
    Do While FilesInPath <> ""
        Fnum = Fnum + 1
        ReDim Preserve MyFiles(1 To Fnum)
        MyFiles(Fnum) = FilesInPath
        FilesInPath = Dir()
    Loop
    'Change ScreenUpdating, Calculation and EnableEvents
    With Application
        CalcMode = .Calculation
        .Calculation = xlCalculationManual
        .ScreenUpdating = False
        .EnableEvents = False
    End With
    'Loop through all files in the array (MyFiles)
    If Fnum > 0 Then
        For Fnum = LBound(MyFiles) To UBound(MyFiles)
            Set mybook = Nothing
            On Error Resume Next
            Set mybook = Workbooks.Open(MyPath & MyFiles(Fnum))
            On Error GoTo 0
            If Not mybook Is Nothing Then
                'Delete the first worksheet in mybook
                '(referenced explicitly, so it does not depend on which sheet is active)
                On Error Resume Next
                mybook.Worksheets(1).Delete
                If Err.Number > 0 Then
                    ErrorYes = True
                    Err.Clear
                    'Close mybook without saving
                    mybook.Close savechanges:=False
                Else
                    'Save and close mybook
                    mybook.Close savechanges:=True
                End If
                On Error GoTo 0
            Else
                'Not possible to open the workbook
                ErrorYes = True
            End If
        Next Fnum
    End If
    If ErrorYes = True Then
        MsgBox "There are problems in one or more files, possible problem:" _
             & vbNewLine & "protected workbook/sheet or a sheet/range that does not exist"
    End If
    'Restore ScreenUpdating, Calculation and EnableEvents
    With Application
        .ScreenUpdating = True
        .EnableEvents = True
        .Calculation = CalcMode
    End With
    Application.DisplayAlerts = True
End Sub