我几年前在您的网站上发现了一些可爱的代码,它为我提供了导出到Excel的特定表格,行和列信息,并且效果很好。 (感谢原始海报)。
但是,我现在被要求抓住另一个表格单元格,而且这个单元格中有几个段落,它们已经自动编号为单元格中的列表(或另一个,子弹指向)。我并不总是知道列表中有多少项,但我需要完整的单元格内容。
我遇到的问题是,当数据通过编码导出到Excel时,它会丢失编号,并且回车,并且基本上所有都会一起运行而不会中断前一行的数据。
例如 -
成为出口:
P& ID 111222DWG 111-5456DOC512BC-1234
有人可以建议如何调整代码以阻止数据一起运行吗?我很乐意将数据放在一个Excel单元格中,如果是这样的话,我会很高兴。
提前致谢,温迪
Sub wordScrape()
Dim wrdDoc As Object
Dim objFiles As Object
Dim fso As Object
Dim wordApp As Object
Dim sh1 As Worksheet
Dim x As Integer
' Change this to the folder containing your word documents
FolderName = "Y:\120\TEST"
Set sh1 = ThisWorkbook.Sheets(1)
Set fso = CreateObject("Scripting.FileSystemObject")
Set wordApp = CreateObject("Word.application")
Set objFiles = fso.GetFolder(FolderName).Files
x = 1
For Each wd In objFiles
If InStr(wd, ".docx") And InStr(wd, "~") = 0 Then
Set wrdDoc = wordApp.Documents.Open(wd.Path, ReadOnly = True)
'word document file name
sh1.Cells(x, 1) = wd.Name
'document number - Table 1, Row 2, Column 1
sh1.Cells(x, 2) = Application.WorksheetFunction.Clean(wrdDoc.Tables(1).Cell(Row:=2, Column:=1).Range)
'document title - Table 1, Row 3, Column 1
sh1.Cells(x, 3) = Application.WorksheetFunction.Clean(wrdDoc.Tables(1).Cell(Row:=3, Column:=1).Range)
'cell for tags for document - Table 1, Row 9, Column 2
' note - if more than 1 line, and automatic numbering in WORD doc, when exported, will remove numbering and line breaks - runs everything together
sh1.Cells(x, 4) = Application.WorksheetFunction.Clean(wrdDoc.Tables(1).Cell(Row:=9, Column:=2).Range)
'cell that notes frequency for doc - Table 1, Row 16, Column 2
sh1.Cells(x, 5) = Application.WorksheetFunction.Clean(wrdDoc.Tables(1).Cell(Row:=16, Column:=2).Range)
'sh1.Cells(x, 3) = ....more extracted data....
x = x + 1
wrdDoc.Close
End If
Next wd
wordApp.Quit
End Sub
答案 0 :(得分:2)
这将从Word表格单元格中提取常规或项目符号文本,格式化为在Excel单元格中使用。
它会添加"子弹"或者如果单词中的文本被格式化为列表,则为数字(但请注意,如果单元格具有混合格式,则数字将关闭)
'get the text from a table cell
Function CellContent(wdCell) As String
Dim s As String, i As Long, pc As Long, p As Object
pc = wdCell.Range.Paragraphs.Count
'loop over paragraphs in cell (could just be 1)
For i = 1 To pc
s = s & IIf(i > 1, Chr(10), "") 'line break if not first para
Set p = wdCell.Range.Paragraphs(i)
'any list format applied ?
Select Case p.Range.listformat.listtype
Case 2: s = s & "* " 'bullet
Case 3: s = s & i & ". " 'numbered
End Select
s = s & p.Range.Text
Next i
CellContent = Left(s, Len(s) - 1) 'trim off end-of-cell mark from Word
End Function
以下是您从当前Sub中调用它的方式:
sh1.Cells(x, 4) = CellContent( wrdDoc.Tables(1).Cell(9, 2) )