Scraping data from a website using VBA does not work

Time: 2021-08-01 13:19:13

Tags: vba web web-scraping mshtml

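I want to build a list of player names scraped from a website. Internet Explorer starts, but then I get run-time error '438': Object doesn't support this property or method.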

Structure of the webpage

My code is as follows:

Option Explicit

Sub Kickbase()

Dim IE As New SHDocVw.InternetExplorer
Dim HTMLdoc As MSHTML.HTMLDocument
Dim HTMLPlayers As MSHTML.IHTMLElementCollection
Dim HTMLPlayer As MSHTML.IHTMLElement
Dim i As Integer
Dim HTMLfirstName As Object
Dim firstName As String


IE.Visible = True
IE.Navigate "https://play.kickbase.com/transfermarkt/kaufen"

Do While IE.ReadyState <> READYSTATE_COMPLETE
Loop

Application.Wait (Now + TimeValue("0:00:10"))

Set HTMLdoc = IE.Document

Set HTMLPlayers = HTMLdoc.getElementsByClassName("players")

For i = 0 To HTMLPlayers(0).getElementsByClassName("firstName").Length - 1

Set HTMLfirstName = HTMLPlayers(0).getElementsByClassName("firstName")

   If Not HTMLfirstName Is Nothing Then
    firstName = Trim(HTMLfirstName.innerText)
    
   Else
     firstName = "no_value"
   End If

Debug.Print firstName

Next i

End Sub

I have enabled the following libraries: (screenshot)

1 answer:

Answer 0 (score: 0)

Since I cannot test the website myself, the code below may not be the best approach, but it should work:

Sub Kickbase()

    Dim IE As New SHDocVw.InternetExplorer
    Dim HTMLdoc As MSHTML.HTMLDocument
    Dim HTMLPlayers As Object
    Dim i As Integer
    Dim firstName As String
        
    IE.Visible = True
    IE.navigate "https://play.kickbase.com/transfermarkt/kaufen"
    
    Do While IE.readyState <> READYSTATE_COMPLETE
        DoEvents
    Loop
    
    Application.Wait (Now + TimeValue("0:00:10"))
    
    Set HTMLdoc = IE.document
    Set HTMLPlayers = HTMLdoc.getElementsByClassName("playerName")
            
    For i = 0 To HTMLPlayers(0).getElementsByClassName("firstName").Length - 1
        
        firstName = Trim$(HTMLPlayers(0).getElementsByClassName("firstName")(i).innerText)
        If firstName = vbNullString Then firstName = "no_value"
            
        Debug.Print firstName
    Next i
    
    '=== Optional depending on your use case, remember to close IE or else it will remain there ===
    'IE.Quit
    'Set IE = Nothing
    
End Sub
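
On the likely cause of error 438 in the question: getElementsByClassName returns a collection of elements, and a collection has no innerText property. Only a single element taken from the collection has one, which is why the code above indexes with (i) before reading innerText. The routine below is a minimal sketch of that pattern, not tested against the actual site. PrintTextByClass is a hypothetical helper name, and the sketch assumes the page is rendered in a document mode that supports getElementsByClassName (IE9 or later).

```vba
Option Explicit

' Hypothetical helper (not part of the original code): prints the trimmed
' text of every element that carries the given class name.
' Assumes doc is a loaded HTML document, e.g. IE.document.
Sub PrintTextByClass(ByVal doc As Object, ByVal className As String)

    Dim matches As Object       ' late-bound element collection
    Dim i As Long
    Dim elementText As String

    Set matches = doc.getElementsByClassName(className)

    ' Guard against an empty collection (e.g. content not loaded yet or
    ' a wrong class name) instead of failing when it is indexed.
    If matches.Length = 0 Then
        Debug.Print "No elements with class '" & className & "' found"
        Exit Sub
    End If

    For i = 0 To matches.Length - 1
        ' Pick a single element first; only the element has innerText.
        elementText = Trim$(matches.Item(i).innerText)
        If Len(elementText) = 0 Then elementText = "no_value"
        Debug.Print elementText
    Next i

End Sub
```

After the page has finished loading, it could be called as PrintTextByClass IE.document, "firstName". Unlike the loop above, this goes through every matching element in the document rather than only those inside the first "playerName" element, so the output may differ if the class is used elsewhere on the page.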