I want to extract information from a web page. Here is an example of the markup:
<div class="class1">
<div class="class2">Address</div>
address1<br>
address2<br>
Zipcode, City, Country <br>
<div class="class2">phone</div>
+1 352 555 555 <br>
<div class="class2">Mobile</div>
0563 555 31 56
<div class="class2">Email</div>
email@provider.com<br>
</div>
I want to get the following:
I have the following code:
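Sub GetData()
    ' Requires a reference to "Microsoft HTML Object Library" (Tools > References).
    Dim oHtml As HTMLDocument
    Dim oElement As Object
    Dim i As Integer

    Set oHtml = New HTMLDocument
    ' Fetch the page synchronously and load its markup into the document.
    With CreateObject("WINHTTP.WinHTTPRequest.5.1")
        .Open "GET", "http://www.example.com", False
        .send
        oHtml.body.innerHTML = .responseText
    End With

    ' Print the text of every element with class "class1".
    i = 1
    For Each oElement In oHtml.getElementsByClassName("class1")
        Debug.Print i, oElement.outerText
        i = i + 1
    Next oElement
End Sub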
Any help is appreciated!
Thank you, Tina
Answer 0 (score: 0)
Quick and dirty, but it works:
Sub GetData()
    Dim oHtml As HTMLDocument
    Dim resultText As String

    Set oHtml = New HTMLDocument
    With CreateObject("WINHTTP.WinHTTPRequest.5.1")
        .Open "GET", "http://www.example.com", False
        .send
        oHtml.body.innerHTML = .responseText
    End With

    ' Get the inner text of the first div with class "class1"
    resultText = oHtml.getElementsByClassName("class1")(0).innerText

    ' Delete the headers
    resultText = Replace(resultText, "Address", "")
    resultText = Replace(resultText, "phone", "")
    resultText = Replace(resultText, "Mobile", "")
    resultText = Replace(resultText, "Email", "")

    ' Collapse repeated line breaks until no empty lines remain
    Do While InStr(resultText, vbCrLf & vbCrLf) > 0
        resultText = Replace(resultText, vbCrLf & vbCrLf, vbCrLf)
    Loop

    Debug.Print resultText
End Sub
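The string replacement above will also strip a header word if it happens to appear inside a value (an address containing "Address", for example). If that matters, one alternative is to walk the child nodes of the class1 div and collect the text that follows each class2 header. The following is only a sketch under a few assumptions: the function name ParseLabelledBlock and its variables are made up for illustration, it expects the markup to look like the sample in the question, and it needs a reference to "Microsoft HTML Object Library".

```vba
' Hypothetical helper, not part of the original answer.
' Returns a Scripting.Dictionary mapping each class2 header to the text after it.
Function ParseLabelledBlock(ByVal html As String) As Object
    Dim doc As HTMLDocument
    Dim container As Object, node As Object
    Dim result As Object
    Dim currentLabel As String, piece As String
    Dim i As Long

    Set result = CreateObject("Scripting.Dictionary")
    Set doc = New HTMLDocument
    doc.body.innerHTML = html

    ' Assumes at least one element with class "class1" exists.
    Set container = doc.getElementsByClassName("class1")(0)

    For i = 0 To container.childNodes.Length - 1
        Set node = container.childNodes.Item(i)
        Select Case node.NodeType
            Case 1 ' element node: a class2 div starts a new field, <br> is skipped
                If node.className = "class2" Then
                    currentLabel = Trim$(node.innerText)
                    result(currentLabel) = ""
                End If
            Case 3 ' text node: belongs to the most recent header
                piece = Trim$(Replace(Replace(node.NodeValue, vbCr, ""), vbLf, ""))
                If Len(piece) > 0 And Len(currentLabel) > 0 Then
                    If Len(result(currentLabel)) > 0 Then
                        result(currentLabel) = result(currentLabel) & ", "
                    End If
                    result(currentLabel) = result(currentLabel) & piece
                End If
        End Select
    Next i

    Set ParseLabelledBlock = result
End Function
```

In GetData you could then call Set fields = ParseLabelledBlock(.responseText) inside the With block and read fields("Email"), fields("phone") and so on. The multi-line address is joined with ", " here; change the separator if you would rather keep line breaks.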