Getting data from a website

Time: 2013-10-12 14:50:12

Tags: excel vba

<span itemprop="streetAddress">

    **94 Grand St**

</span>

How can I get this data in Excel VBA using one of the getElementBy... methods?

I have tried getElementById, getElementsByName, and so on, but nothing works.

Option Explicit

Sub find()
'Uses late binding, or add reference to Microsoft HTML Object Library
'  and change variable Types to use intellisense
Dim ie As Object 'InternetExplorer.Application
Dim html As Object 'HTMLDocument
Dim Listings As Object 'IHTMLElementCollection
Dim l As Object 'IHTMLElement
Dim r As Long
    Set ie = CreateObject("InternetExplorer.Application")
    With ie
        .Visible = False ' Don't show window
        .Navigate "http://www.yelp.com/biz/if-boutique-new-york#query:boutique"
        'Wait until IE is done loading page
        Do While .readyState <> 4
            Application.StatusBar = "Downloading information, Please wait..."
            DoEvents
        Loop
        Set html = .Document
    End With
    Set Listings = html.getElementsByTagName("span") ' ## returns the list
    MsgBox (Listings(0))
    For Each l In Listings
        '## make sure this list item looks like the listings Div Class:
        '   then, build the string to put in your cell
        Range("A1").Offset(r, 0).Value = l.innerText
        r = r + 1
    Next

    Set html = Nothing
    Set ie = Nothing
End Sub

I am using the program above to get the innerText of the span tags, but it does not work.

1 Answer:

Answer 0: (score: 1)

For the single result you want from the detail page, use these two lines in your code (a detail page contains only one listing).

Adjusting your IE code

  Set Listings = html.getElementById("bizInfoBody") ' ## returns a single element, not a list
  Range("A1").Offset(r, 0).Value = Listings.innerText
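
If only the street address from the question's markup is needed, one possible approach is to filter the span elements by their itemprop attribute. The following is a minimal sketch, not tested against the live page: SpanTextByItemProp is a hypothetical helper name, and it assumes html is the late-bound document object obtained from IE as in the question.

```vba
Option Explicit

'Hypothetical helper (not part of the original answer): returns the trimmed
'innerText of the first <span> whose itemprop attribute equals propName,
'or an empty string if no such span exists.
Function SpanTextByItemProp(ByVal html As Object, ByVal propName As String) As String
    Dim el As Object    'IHTMLElement
    Dim attr As Variant 'getAttribute may return Null when the attribute is absent

    For Each el In html.getElementsByTagName("span")
        attr = el.getAttribute("itemprop")
        If Not IsNull(attr) Then
            If CStr(attr) = propName Then
                SpanTextByItemProp = Trim$(el.innerText)
                Exit Function
            End If
        End If
    Next el
End Function
```

It could be called as Range("A1").Value = SpanTextByItemProp(html, "streetAddress") once the page has finished loading.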

Using XMLHTTP

Sub GetTxt()
    Dim objXmlHTTP As Object  'MSXML2.XMLHTTP
    Dim objHtmlDoc As Object  'HTMLDocument
    Dim objHtmlBody As Object 'HTMLBody
    Dim objTbl As Object      'IHTMLElement
    Dim strSite As String

    Set objHtmlDoc = CreateObject("htmlfile")
    Set objHtmlBody = objHtmlDoc.body

    Set objXmlHTTP = CreateObject("MSXML2.XMLHTTP")
    strSite = "http://www.yelp.com/biz/if-boutique-new-york"

    With objXmlHTTP
        .Open "GET", strSite, False 'False = synchronous request
        .Send
        If .Status = 200 Then
            'Load the response into the in-memory HTML document
            objHtmlBody.innerHTML = .responseText
            Set objTbl = objHtmlDoc.getElementById("bizInfoBody")
            'The id may be absent if the page layout differs
            If Not objTbl Is Nothing Then MsgBox objTbl.innerText
        End If
    End With
End Sub
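
The XMLHTTP approach above can also be wrapped in a reusable function so that the result goes to a cell instead of a message box. This is only a sketch under assumptions: FetchElementText is a hypothetical name, the request is synchronous as in the answer, and the page is assumed to still expose an element with the requested id.

```vba
Option Explicit

'Hypothetical helper (not part of the original answer): downloads url and
'returns the innerText of the element with the given id. Returns an empty
'string when the HTTP status is not 200 or the id is not found.
Function FetchElementText(ByVal url As String, ByVal elementId As String) As String
    Dim http As Object 'MSXML2.XMLHTTP
    Dim doc As Object  'HTMLDocument
    Dim el As Object   'IHTMLElement

    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "GET", url, False 'False = synchronous request
    http.Send
    If http.Status <> 200 Then Exit Function

    'Parse the response in an in-memory HTML document
    Set doc = CreateObject("htmlfile")
    doc.body.innerHTML = http.responseText

    Set el = doc.getElementById(elementId)
    If Not el Is Nothing Then FetchElementText = el.innerText
End Function
```

For example, Range("A1").Value = FetchElementText("http://www.yelp.com/biz/if-boutique-new-york", "bizInfoBody") would write the text to the sheet, or leave the cell empty if the request or lookup fails.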