Using InStr to search for quotes, spaces, colons, etc.

Time: 2018-10-05 22:07:49

Tags: web-scraping access-vba

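This is a continuation of the question "scrape data from web page source where url doesn't change".

I am now trying to search the scraped data, but I can't get the search string coded correctly, and it fails to find the text below.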


I tried this:

<span id="middleContent_lbName_county" style="font-weight:bold;">

The return value I get is 0.

This works:

InStr(.Document.Body.innerHTML,"<span id=" & Chr(34) & "middleContent_lbName_county" & Chr(34) & " style=" & Chr(34) & "font-weight" & Chr(58) & "bold" & Chr(59) & "")

But it is not unique enough, and I get too many results.

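1 answer:

Answer 0 (score: 1)

I'm not entirely clear on what you are after, but the element has an id you can use. The string you are searching for is the element's outerHTML: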

.document.getElementById("middleContent_lbName_county").outerHTML

The information inside it is:

.document.getElementById("middleContent_lbName_county").innerText

Using .innerText returns the facility name.

Using your previous code:
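To see the difference between the two properties without driving a browser, here is a minimal sketch that parses a hard-coded snippet with a late-bound `htmlfile` document (Windows only). The function name `FacilityNameFromHtml` and the text "Sample Facility" are placeholders of mine, not anything taken from the site.

```vba
Option Explicit

' Hypothetical helper: returns the innerText of the span with the known id,
' or an empty string when the id is not present in the supplied HTML.
Public Function FacilityNameFromHtml(ByVal html As String) As String
    Dim doc As Object, el As Object

    Set doc = CreateObject("htmlfile")        ' late-bound MSHTML document
    doc.body.innerHTML = html

    Set el = doc.getElementById("middleContent_lbName_county")
    If el Is Nothing Then Exit Function       ' id not found: return ""

    Debug.Print el.outerHTML                  ' whole tag as the parser serializes it
    FacilityNameFromHtml = el.innerText       ' text content only
End Function

Public Sub DemoFacilityName()
    Dim sample As String
    ' "" inside a string literal produces one double-quote character.
    sample = "<div><span id=""middleContent_lbName_county"" " & _
             "style=""font-weight:bold;"">Sample Facility</span></div>"
    Debug.Print FacilityNameFromHtml(sample)
End Sub
```

The `Is Nothing` check matters on the live page too: `getElementById` returns Nothing when the id is missing, and calling `.innerText` on that would raise a runtime error.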

Option Explicit
' Early binding: requires a reference to Microsoft Internet Controls (Tools > References)
Public Sub VisitPages()
    Dim IE As New InternetExplorer
    With IE
        .Visible = True
        .navigate "http://healthapps.state.nj.us/facilities/acSetSearch.aspx?by=county"

        While .Busy Or .readyState < 4: DoEvents: Wend

        With .document
            .querySelector("#middleContent_cbType_5").Click
            .querySelector("#middleContent_cbType_12").Click
            .querySelector("#middleContent_btnGetList").Click
        End With

        While .Busy Or .readyState < 4: DoEvents: Wend

        Dim list As Object, i  As Long
        Set list = .document.querySelectorAll("#main_table [href*=doPostBack]")
        For i = 0 To list.Length - 1
            list.item(i).Click

            While .Busy Or .readyState < 4: DoEvents: Wend

            'Application.Wait Now + TimeSerial(0, 0, 3) '<== Delete me later. This is just to demo page changes
            Debug.Print .document.getElementById("middleContent_lbName_county").outerHTML
            'do stuff with new page
            .Navigate2 .document.URL             '<== back to homepage
            While .Busy Or .readyState < 4: DoEvents: Wend
            Set list = .document.querySelectorAll("#main_table [href*=doPostBack]") 'reset list (often required in these scenarios)
        Next
        Stop                                     '<== Delete me later
        '.Quit '<== Remember to quit application
    End With
End Sub

Some sample results:

[image not included]
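If you still want to go the InStr route, here is a rough sketch of one way to do it. It searches only for the opening of the tag up to the id, rather than the whole tag, because innerHTML is the browser's own serialization of the DOM and the attribute spacing may not match the page source character for character. `TextOfSpan` is a hypothetical helper name and "Sample Facility" is placeholder text.

```vba
Option Explicit

' Hypothetical helper: returns the text between the opening tag of the span
' with the given id and the next "</span>", or "" when nothing matches.
' It assumes the span contains no nested span elements.
Public Function TextOfSpan(ByVal html As String, ByVal spanId As String) As String
    Dim marker As String
    Dim tagStart As Long, textStart As Long, textEnd As Long

    ' "" inside a literal is one double-quote character, the same as Chr(34).
    ' Colons, semicolons and spaces need no special treatment.
    marker = "<span id=""" & spanId & """"

    tagStart = InStr(1, html, marker, vbTextCompare)
    If tagStart = 0 Then Exit Function

    ' Skip any remaining attributes: find the ">" that ends the opening tag.
    textStart = InStr(tagStart + Len(marker), html, ">")
    If textStart = 0 Then Exit Function
    textStart = textStart + 1

    textEnd = InStr(textStart, html, "</span>", vbTextCompare)
    If textEnd = 0 Then Exit Function

    TextOfSpan = Mid$(html, textStart, textEnd - textStart)
End Function

Public Sub DemoTextOfSpan()
    Dim sample As String
    sample = "<td><span id=""middleContent_lbName_county"" " & _
             "style=""font-weight:bold;"">Sample Facility</span></td>"
    Debug.Print TextOfSpan(sample, "middleContent_lbName_county")
End Sub
```

On the live page you would pass `.document.body.innerHTML` as the first argument. The DOM lookup by id is still the more robust choice, since it does not depend on how the markup happens to be written out.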