我正在尝试为特定HTML元素抓取网站并将其转换为Word,同时保持格式不变。我可以先成功将其导入Excel(然后再导入到Word),但单个单元格的字符限制不能让我获取所需的所有数据。这是我当前的代码,从我从网站上获取HTML开始:
Set html = ie.document
Dim objWord As Object
Dim objDoc As Object
Set objWord = CreateObject("Word.Application")
Set objDoc = objWord.Documents.Add
objWord.Visible = True
Sheets("Sheet1").Range("A1").Value = html.getElementById("main_container").innerText
Sheets("Sheet1").Range("A1").Select
Selection.Copy
objDoc.Range.Paste
如果我可以跳过Excel步骤并将HTML正确粘贴到Word中,那就太棒了。我想到的另一个选项是将main_container HMTL的每个子段粘贴到一个单独的单元格中,以避免达到最大字符限制。如果有人可以提供帮助,或者对我如何解决这个问题有其他想法,那将非常感谢!
答案 0 :(得分:2)
改为使用字符串变量!
Dim sInnerText As String, i As Integer, j As Integer
sInnerText = html.getElementById("main_container").innerText
'insert text into word document
objDoc.Range.Text = sInnerText
'or devide it by length of characters to be able to add parts into cells
i = 1
j = 1
Do While j<Len(sInnerText)
ThisWorkbook.Worksheets("Sheet1").Range("A" & i) = Mid(sInnerText, j, j+255)
j = j + 256
i = i + 1
Loop
答案 1 :(得分:1)
也许你正在寻找这样的东西。我使用这2个程序来设置sub和superscript html标签并使用WildCards撤消
Sub SuperSub()
'Sub y Super Indices a formato HTML
With ActiveDocument.Content.Find
.Text = "" 'Subíndice
.Font.Subscript = 1
.Replacement.Text = "<sub>^&</sub>"
.Replacement.Font.Subscript = 0
.Execute Replace:=wdReplaceAll
End With
With ActiveDocument.Content.Find
.Text = "" 'Superíndice
.Font.Superscript = 1
.Replacement.Text = "<sup>^&</sup>"
.Replacement.Font.Superscript = 0
.Execute Replace:=wdReplaceAll
End With
End Sub
Sub InverSuperSub()
'Sub y Super Indices de html a WORD
With ActiveDocument.Content.Find
.Text = "\<sub\>(*)\<\/sub\>" 'Subíndice
.MatchWildcards = True
.Font.Subscript = 0
.Replacement.Text = "\1"
.Replacement.Font.Subscript = 1
.Execute Replace:=wdReplaceAll
End With
With ActiveDocument.Content.Find
.Text = "\<sup\>(*)\<\/sup\>" 'Superíndice
.MatchWildcards = True
.Font.Superscript = 0
.Replacement.Text = "\1"
.Replacement.Font.Superscript = 1
.Execute Replace:=wdReplaceAll
End With
End Sub
我希望它会帮助你。