Can't use querySelector the right way in VBA

Time: 2017-08-20 11:56:03

Tags: vba web-scraping

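I've written some code in VBA to get all the movie names from a specific web page on a torrent site. Stepping through with F8, I can see that the code works fine and prints results until it reaches the last result on that page. As soon as it reaches the last movie name to parse, the program crashes. I tried this several times and got the same outcome each time. If VBA doesn't support this CSS selector method, how am I able to collect every result before the last one? Is there a reference I need to add to the library before running it, or something else? Any help on this would be greatly appreciated.

Here is the code I wrote: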

Sub Torrent_data()

    Dim http As New XMLHTTP60, html As New HTMLDocument
    Dim movie_name As Object, movie As Object

    With http
        .Open "GET", "https://www.yify-torrent.org/search/1080p/", False
        .send
        html.body.innerHTML = .responseText
    End With

    Set movie_name = html.querySelectorAll("div.mv h3 a")

    For Each movie In movie_name
        x = x + 1: Cells(x, 1) = movie.innerText
    Next movie

End Sub

4 answers:

Answer 0 (score: 2)

Try this:

Sub Torrent_data()

    Dim http As New XMLHTTP60, html As New HTMLDocument, x As Long

    With http
        .Open "GET", "https://www.yify-torrent.org/search/1080p/", False
        .send
        html.body.innerHTML = .responseText
    End With

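    ' Index into the node list instead of using For Each; reading past the
    ' last match raises an error (91, object variable not set), which ends the loop.
    On Error Resume Next
    Do
        x = x + 1
        Cells(x, 1) = html.querySelectorAll("div.mv h3 a")(x - 1).innerText
    Loop Until Err.Number <> 0
    On Error GoTo 0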

End Sub

Answer 1 (score: 2)

The code retrieves one element past the last movie.

This extra element causes the failure, so for each ... cannot be used.

Not sure why yet... will update.
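One way to see how many real matches the selector returns is to print the node list's Length before looping. This is only a diagnostic sketch, not the answerer's code: the Sub name is made up for illustration, and it assumes the same two references the question's code needs (Microsoft XML, v6.0 and Microsoft HTML Object Library).

```vba
Sub Count_movie_links()
    ' Hypothetical helper name. Assumes references to
    ' Microsoft XML, v6.0 and Microsoft HTML Object Library.
    Dim http As New XMLHTTP60, html As New HTMLDocument

    With http
        .Open "GET", "https://www.yify-torrent.org/search/1080p/", False
        .send
        html.body.innerHTML = .responseText
    End With

    ' Length is the number of elements matched by the selector;
    ' valid indexes run from 0 to Length - 1.
    Debug.Print html.querySelectorAll("div.mv h3 a").Length
End Sub
```

The count appears in the Immediate window, and any loop that stays within 0 to Length - 1 never touches the extra element.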

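Answer 2 (score: 1)

It looks like querySelectorAll has some kind of problem.

The object html.querySelectorAll(".mv h3 a") cannot be inspected in the Watch window.

Trying to do so crashes Excel or Word (I tried both).

I tried other tags, with the same result.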

Sub Torrent_data()

    Dim http As New XMLHTTP60, html As New HTMLDocument
    Dim movie_name As Object, movie As Object

    With http
        .Open "GET", "https://www.yify-torrent.org/search/1080p/", False
        .send
        html.body.innerHTML = .responseText
    End With

'   Set movie_name = html.querySelectorAll("div.mv h3 a")   ' querySelectorAll crashes VBA when trying to examine movie_name object

    Set movie_name = html.getElementsByClassName("mv")      ' HTMLElementCollection

    For Each movie In movie_name
        x = x + 1: Cells(x, 1) = movie.getElementsByTagName("a")(1).innerText
    Next movie

'   HTML block for each movie looks like this

'   <div class="mv">
'       <h3>
'           <a href='/movie/55346/download-smoke-1995-1080p-mp4-yify-torrent.html' target="_blank" title="Smoke (1995) 1080p">Smoke (1995) 1080p</a>
'       </h3>
'       <div class="movie">
'           <div class="movie-image">
'               <a href="/movie/55346/download-smoke-1995-1080p-mp4-yify-torrent.html" target="_blank" title="Download Smoke (1995) 1080p">
'                   <span class="play"><span class="name">Smoke (1995) 1080p</span></span>
'                   <img src="//pic.yify-torrent.org/20170820/55346/smoke-1995-1080p-poster.jpg" alt="Smoke (1995) 1080p" />
'               </a>
'           </div>
'       </div>
'       <div class="mdif">
'           <ul>
'               <li><b>Genre:</b>Comedy</li><li><b>Quality:</b>1080p</li><li><b>Screen:</b>1920x1040</li><li><b>Size:</b>2.14G</li><li><b>Rating:</b>7.4/10</li><li><b>Peers:</b>2</li><li><b>Seeds:</b>0</li>
'           </ul>
'           <a href="/movie/55346/download-smoke-1995-1080p-mp4-yify-torrent.html" class="small button orange" target="_blank" title="Download Smoke (1995) 1080p YIFY Torrent">Download</a>
'       </div>
'   </div>

End Sub

Answer 3 (score: 0)

I know this is old, but I managed to use querySelectorAll without crashing my IE.

I used a For loop rather than For-each.
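A minimal sketch of that indexed-loop idea, written against the code from the question rather than taken from this answer. The Sub name and the counter i are illustrative, and it assumes the node list exposes Length and item when held in an Object variable, with the same references set as in the question (Microsoft XML, v6.0 and Microsoft HTML Object Library).

```vba
Sub Torrent_data_indexed()
    ' Hypothetical name. Assumes references to
    ' Microsoft XML, v6.0 and Microsoft HTML Object Library.
    Dim http As New XMLHTTP60, html As New HTMLDocument
    Dim movie_name As Object, i As Long

    With http
        .Open "GET", "https://www.yify-torrent.org/search/1080p/", False
        .send
        html.body.innerHTML = .responseText
    End With

    Set movie_name = html.querySelectorAll("div.mv h3 a")

    ' Walk the node list by index. Length bounds the loop, so nothing
    ' past the last match is ever requested, unlike For Each.
    For i = 0 To movie_name.Length - 1
        Cells(i + 1, 1) = movie_name.item(i).innerText
    Next i
End Sub
```

Because the selector runs only once and the loop is bounded by Length, this needs no error handler to decide when to stop.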