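This is a follow-up to the question "scrape data from web page source where url doesn't change".
I am now trying to search the scraped data, but I can't encode the search string correctly, and the text below is not found.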
var render = ["Element1","Element2","Element1","Element1","Element1"];
var exhaustive = [];
for (var i = 0; i < render.length; i++) {
  // Push a value only if it has not been collected yet.
  // (The original inner loop never ran because exhaustive started empty.)
  if (exhaustive.indexOf(render[i]) === -1) {
    exhaustive.push(render[i]);
  }
}
console.log(exhaustive); // Expected result ["Element1","Element2"]
I tried this:
<span id="middleContent_lbName_county" style="font-weight:bold;">
The return value I get is 0.
This works:
InStr(.Document.Body.innerHTML,"<span id=" & Chr(34) & "middleContent_lbName_county" & Chr(34) & " style=" & Chr(34) & "font-weight" & Chr(58) & "bold" & Chr(59) & "")
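But it is not unique enough, and I get too many results.
Answer 0 (score: 1)
I'm not entirely clear on what you need, but there is an ID you can use. The string you are searching for is the element's outer HTML: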
.document.getElementById("middleContent_lbName_county").outerHTML
The information inside it is:
.document.getElementById("middleContent_lbName_county").innerText
Using .innerText will return the facility name.
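As a minimal sketch of the ID-based lookup, the helper below wraps getElementById and returns an empty string when the element is missing, so a page that has not finished loading does not raise a run-time error. The name GetTextById is hypothetical (it is not part of the original code), and the late-bound Object parameter is an assumption made so that no extra library reference is needed.

```vba
' Hypothetical helper, not part of the original answer.
' Returns the inner text of the element with the given id,
' or an empty string if no such element exists in the document.
Public Function GetTextById(ByVal doc As Object, ByVal elementId As String) As String
    Dim el As Object
    Set el = doc.getElementById(elementId)
    If Not el Is Nothing Then
        GetTextById = el.innerText
    End If
End Function
```

Inside the With IE block it could be called as Debug.Print GetTextById(.document, "middleContent_lbName_county"), which avoids searching innerHTML with InStr altogether.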
Using your previous code:
Option Explicit
Public Sub VisitPages()
    Dim IE As New InternetExplorer
    With IE
        .Visible = True
        .navigate "http://healthapps.state.nj.us/facilities/acSetSearch.aspx?by=county"
        While .Busy Or .readyState < 4: DoEvents: Wend
        With .document
            .querySelector("#middleContent_cbType_5").Click
            .querySelector("#middleContent_cbType_12").Click
            .querySelector("#middleContent_btnGetList").Click
        End With
        While .Busy Or .readyState < 4: DoEvents: Wend
        Dim list As Object, i As Long
        Set list = .document.querySelectorAll("#main_table [href*=doPostBack]")
        For i = 0 To list.Length - 1
            list.item(i).Click
            While .Busy Or .readyState < 4: DoEvents: Wend
            ' Application.Wait Now + TimeSerial(0, 0, 3) '<== Delete me later. This is just to demo page changes
            Debug.Print .document.getElementById("middleContent_lbName_county").outerHTML
            'do stuff with new page
            .Navigate2 .document.URL '<== back to homepage
            While .Busy Or .readyState < 4: DoEvents: Wend
            Set list = .document.querySelectorAll("#main_table [href*=doPostBack]") 'reset list (often required in these scenarios)
        Next
        Stop '<== Delete me later
        '.Quit '<== Remember to quit application
    End With
End Sub
Some sample results: