我有超过300个包含单词表的单词文档,我一直在尝试为excel编写一个VBA脚本来提取我需要的信息,而且我是Visual Basic的新手。我需要将文件名复制到第一个单元格,以下单元格包含我要提取的信息,然后是下一个文件名,循环开启直到搜索并提取所有单词文档。我尝试了多种不同的方法,但我能找到的最接近的代码如下。它可以提取部件号,但不能描述。它还提取了不需要的无关信息,但如果它是必要的危险,我可以解决这些信息。 我有一个示例word文件(用其他信息替换敏感信息),但我不知道如何附加word文档的第1页和第2页的word文档或jpegs。我知道如果你能看到它会是有益的,所以请告诉我如何在这里或你这么做,这样你就可以看到它。
所以重新迭代:
这是我一直试图操纵的代码。我找到了它,它是一个表格的第一行和最后一行,我试图让它工作,为了我的目的无济于事:
Sub GetTablesFromWord()
'this Excel file must be in
'the same folder with the Word
'document files that are to be'processed.
Dim wApp As Word.Application
Dim wDoc As Word.Document
Dim wTable As Word.Table
Dim wCell As Word.Cell
Dim basicPath As String
Dim fName As String
Dim myWS As Worksheet
Dim xlCell As Range
Dim lastRow As Long
Dim rCount As Long
Dim cCount As Long
Dim RLC As Long
Dim CLC As Long
basicPath = ThisWorkbook.Path & Application.PathSeparator
'change the sheet name as required
Set myWS = ThisWorkbook.Worksheets("Sheet1")
'clear any/all previous data on the sheet myWS.Cells.Clear
'"open" Word Set wApp = CreateObject("Word.Application")
'get first .doc file name in the folder
'with this Excel file
fName = Dir(basicPath & "*.doc*")
Do While fName <> ""
'this puts the filename into column A to
'help separate the table data in Excel
myWS.Range("A" & Rows.Count).End(xlUp).Offset(1, 0) = _
"FILE: [" & fName & "]"
'open the Word file
wApp.Documents.Open basicPath & fName
Set wDoc = wApp.Documents(1)
'if there is a table in the
'Word Document, work with it
If wDoc.Tables.Count > 0 Then
Set wTable = wDoc.Tables(3)
rCount = wTable.Rows.Count
cCount = wTable.Columns.Count
For RLC = 1 To rCount
lastRow = myWS.Range("A" & Rows.Count).End(xlUp).Row + 1
For CLC = 1 To cCount
'if there are merged cells in the
'Word table, an error will be
'generated - ignore the error,
'but also won't process the data
On Error Resume Next
Set wCell = wTable.Cell(RLC, CLC)
If Err <> 0 Then
Err.Clear
Else
If CLC = 1 Then
Set xlCell = myWS.Range("A" & lastRow)
xlCell = wCell
Else
Set xlCell = myWS.Range("B" & lastRow)
xlCell = wCell
End If
End If
On Error GoTo 0
Next
Next
Set wCell = Nothing
Set wTable = Nothing
End If ' end of wDoc.Tables.Count test
wDoc.Close False
Set wDoc = Nothing
fName = Dir()
' gets next .doc* filename in the folder
Loop wApp.Quit
Set wApp = Nothing
MsgBox "Task Completed"
End Sub
答案 0 :(得分:0)
此代码循环遍历文件夹中包含的所有.docx文件,将数据提取到电子表格中,关闭word文档,然后移动到下一个文档。 Word文档的名称被提取到A列中,文档中第3个表中的值被提取到B列中。这应该是您构建的良好起点。
Sub wordScrape()
Dim wrdDoc As Object, objFiles As Object, fso As Object, wordApp As Object
Dim sh1 As Worksheet
Dim x As Integer
FolderName = "C:\code" ' Change this to the folder containing your word documents
Set sh1 = ThisWorkbook.Sheets(1)
Set fso = CreateObject("Scripting.FileSystemObject")
Set wordApp = CreateObject("Word.application")
Set objFiles = fso.GetFolder(FolderName).Files
x = 1
For Each wd In objFiles
If InStr(wd, ".docx") And InStr(wd, "~") = 0 Then
Set wrdDoc = wordApp.Documents.Open(wd.Path, ReadOnly = True)
sh1.Cells(x, 1) = wd.Name
sh1.Cells(x, 2) = Application.WorksheetFunction.Clean(wrdDoc.Tables(3).Cell(Row:=3, Column:=2).Range)
'sh1.Cells(x, 3) = ....more extracted data....
x = x + 1
wrdDoc.Close
End If
Next wd
wordApp.Quit
End Sub