VBA data extraction

Asked: 2018-10-11 17:56:32

Tags: html excel vba web-scraping

Please help me write a VBA routine that extracts a WhatsApp contact number from a web page.

Here is the URL:

https://www.justdial.com/Ahmedabad/Kalon-Laser-Skin-And-Slimming-Across-Shell-Petrol-Pump-Prahladnagar/079PXX79-XX79-171119143016-Q7S9_BZDET?xid=QWhtZWRhYmFkIEJlYXV0eSBQYXJsb3Vycw

The page contains a hidden WhatsApp contact, and I want to extract its number.

Below is my code. It has some errors.

Public Sub GetValueFromBrowser()

    Dim Sn As Integer
    Dim ie As Object
    Dim url As String
    Dim Doc As HTMLDocument

    url = "https://www.justdial.com/Ahmedabad/Kalon-Laser-Skin-And-Slimming-Across-Shell-Petrol-Pump-Prahladnagar/079PXX79-XX79-171119143016-Q7S9_BZDET?xid=QWhtZWRhYmFkIEJlYXV0eSBQYXJsb3Vycw"

    Set ie = CreateObject("InternetExplorer.Application")

    With ie
        .Visible = 0
        .navigate url
        While .Busy Or .readyState <> 4
            DoEvents
        Wend
    End With

    Set Doc = ie.document

    Range("C13") = Trim(Doc.getElementsByID("whatsapptriggeer")(0).Value)

End Sub


1 answer:

Answer 0 (score: 1)

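You can parse it out of an element's HTML. The page has a button element whose onclick attribute carries the hidden phone number as the userid parameter. You can retrieve that element with the CSS selector .jbtn.fltrt[onclick*=userid], which matches an element with both classes jbtn and fltrt whose onclick attribute contains "userid".

Then take the element's outer HTML: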

<BUTTON onclick="chkphtvc('079PXX79.XX79.171119143016.Q7S9','https://catalog.justdial.com/mcatalog/index.php?type=website&amp;v=1&amp;l=1&amp;c=1&amp;city=Ahmedabad&amp;company_name=Kalon+Laser+Skin+And+Slimming&amp;docid=079PXX79.XX79.171119143016.Q7S9&amp;userid=9429907546&amp;ps=6&amp;vcode=&amp;m=1&amp;pincode=380015','');" class="jbtn fltrt">Submit</BUTTON>

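Then parse the number out of that string: it is the text between "userid=" and the next "&". You can optionally add the +91 prefix.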

Option Explicit

' Requires a reference to "Microsoft HTML Object Library"
' (VBE > Tools > References) for the HTMLDocument type.
Public Sub GetTelNumber()
    Dim sResponse As String, html As HTMLDocument

    ' Fetch the page source with a synchronous GET request.
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", "https://www.justdial.com/Ahmedabad/Kalon-Laser-Skin-And-Slimming-Across-Shell-Petrol-Pump-Prahladnagar/079PXX79-XX79-171119143016-Q7S9_BZDET?xid=QWhtZWRhYmFkIEJlYXV0eSBQYXJsb3Vycw", False
        ' An old If-Modified-Since date avoids being served a cached copy.
        .setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
        .send
        sResponse = StrConv(.responseBody, vbUnicode)
    End With
    Set html = New HTMLDocument

    With html
        .body.innerHTML = sResponse
        ' Take the text between "userid=" and the next "&" in the button's outer HTML.
        Debug.Print Split(Split(.querySelector(".jbtn.fltrt[onclick*=userid]").outerHTML, "userid=")(1), "&")(0)
    End With

End Sub
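
The nested Split call above raises a run-time error if "userid=" is missing from the HTML. As a rough sketch only, the parsing step could be moved into a small function that returns an empty string in that case and takes the optional +91 prefix as an argument. The names ExtractUserId, DemoExtractUserId, PARAM_NAME, and countryPrefix are made up for this illustration and are not part of the original answer.

```vba
' Hypothetical helper: returns the value of the "userid" parameter found in
' an HTML fragment, optionally prefixed with a country code.
' Returns an empty string when the parameter is not present.
Public Function ExtractUserId(ByVal fragment As String, _
                              Optional ByVal countryPrefix As String = "") As String
    Const PARAM_NAME As String = "userid="
    Dim startPos As Long, endPos As Long

    startPos = InStr(1, fragment, PARAM_NAME, vbTextCompare)
    If startPos = 0 Then Exit Function          ' parameter not present

    startPos = startPos + Len(PARAM_NAME)       ' first character of the value
    endPos = InStr(startPos, fragment, "&")     ' next "&" (this also matches "&amp;")
    If endPos = 0 Then endPos = Len(fragment) + 1

    ExtractUserId = countryPrefix & Mid$(fragment, startPos, endPos - startPos)
End Function

' Usage with a shortened copy of the onclick text shown earlier.
Public Sub DemoExtractUserId()
    Dim sample As String
    sample = "docid=079PXX79.XX79.171119143016.Q7S9&amp;userid=9429907546&amp;ps=6"
    Debug.Print ExtractUserId(sample, "+91")    ' prints +919429907546
End Sub
```

Note that when no "&" follows the value, this sketch returns everything up to the end of the fragment, so a real caller may want to trim quotes or other trailing characters.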

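As for your other link: I cannot access the page containing the ID you show, so this is untested, but in theory you can get the element by its ID, read its href, and parse the phone number out of it: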

Debug.Print Split(doc.getElementById("whatsapptriggeer").href, "phone=")(1)
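
That one-liner fails with a run-time error when the element is missing or its href has no "phone=" part. A more defensive version might look like the sketch below. It assumes, as the line above does, that the element with ID whatsapptriggeer is a link whose href contains "phone="; the function name WhatsAppPhone is hypothetical.

```vba
' Hypothetical helper; doc is any loaded HTMLDocument (passed late-bound).
' Returns an empty string when the element or the phone parameter is missing.
Public Function WhatsAppPhone(ByVal doc As Object) As String
    Dim link As Object
    Dim parts() As String

    Set link = doc.getElementById("whatsapptriggeer")
    If link Is Nothing Then Exit Function        ' element not on the page

    parts = Split(link.href, "phone=")
    If UBound(parts) < 1 Then Exit Function      ' href has no phone parameter
    If Len(parts(1)) = 0 Then Exit Function      ' "phone=" is present but empty

    WhatsAppPhone = Split(parts(1), "&")(0)      ' drop any trailing parameters
End Function
```

With the question's code, this could be called as Range("C13") = WhatsAppPhone(ie.document) once the page has finished loading.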