Querying yellowpages.com to return street addresses

Date: 2014-10-10 19:02:00

Tags: excel excel-vba yellow-pages vba

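I am trying to take a list of names and zip codes from Excel, enter them one name and zip code at a time into the search fields on www.yellowpages.com, and return the street-address results to Excel in the same order as the original names and zip codes. No error message is returned; the macro just stops without finishing. I don't know exactly where it stops, but it does open Internet Explorer, enter the search terms, and click search, which I can see when .Visible = True. My best guess is that the problem lies somewhere between the "" marks.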

Here is my code (adapted from DontFretBrett and Dinesh Kumar Takyar):

Sub Address_Scrape()
    Dim eRow As Long
    Dim ele As Object
    Dim wb As Workbook
    Dim srch As Worksheet
    Dim trgt As Worksheet
    Set wb = ThisWorkbook
    Set srch = wb.Sheets("Master with addresses")
    Set trgt = wb.Sheets("Sheet1")
    Dim url As String
    Dim zc As String
    Dim Name As String

    Name = srch.Range("B2")
    zc = srch.Range("F2")
    url = "URL;http://www.yellowpages.com/"
    url = url & "/" & zc & "/" & Name
    RowCount = 1
    trgt.Range("A" & RowCount) = "Name"
    trgt.Range("B" & RowCount) = "Address"
    trgt.Range("C" & RowCount) = "City"
    trgt.Range("D" & RowCount) = "State"
    trgt.Range("E" & RowCount) = "Zip"
    eRow = Sheet1.Cells(Rows.Count, 1).End(xlUp).Offset(1, 0).Row
    Set objIE = CreateObject("InternetExplorer.Application")
    With objIE
        .navigate "http://www.yellowpages.com/"
        .Visible = True
        Do While .Busy Or _
                .readyState <> 4
            DoEvents
        Loop
        Set who = .document.getElementsByName("search_terms")
        who.Item(0).Value = Name
        Set where = .document.getElementsByName("geo_location_terms")
        where.Item(0).Value = zc
        .document.forms(0).submit
        Do While .Busy Or _
                .readyState <> 4
            DoEvents
        Loop
        "Results = .document.getElementsByTagName("p")(0).innerText"
        For Each ele In .document.all
            Select Case ele.tagName
                Case Results
                    RowCount = RowCount + 1
                Case "Name"
                    trgt.Range("A" & RowCount) = ele.getElementByclass("business-name").innerText
                Case "Address"
                    trgt.Range("B" & RowCount) = ele.getElementByclass("street-address").innerText
                Case "City"
                    trgt.Range("C" & RowCount) = Trim(ele.getElementByclass("locality").innerText)
                Case "State"
                    trgt.Range("D" & RowCount) = ele.getElementByitemprop("addressRegion").innerText
                Case "Zip"
                    trgt.Range("E" & RowCount) = ele.getElementByitemprop("postalCode").innerText
            End Select
        Next ele
        Set objIE = Nothing
    End With
End Sub

1 Answer:

Answer 0 (score: 0)

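You basically want to scrape data from a Yellow Pages search.

Some time ago I created a useful Excel add-in that performs this kind of lookup without resorting to VBA: http://blog.tkacprow.pl/excel-scrape-html-add/

Let's start at the beginning, with the structure of the GET URL: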

http://www.yellowpages.com/search?search_terms=[SEARCH_TERM]&geo_location_terms=[LOCATION]

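[SEARCH_TERM] and [LOCATION] are your GET parameters.

Now, using the functions from the add-in, suppose you want to get the text of the element with the class name "business-name". Use this function: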
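If you do build this URL from worksheet values, the parameters need to be URL-encoded first, since business names often contain spaces or apostrophes. Below is a minimal sketch of that step in VBA. The function names UrlEncodeAscii and BuildSearchUrl are hypothetical helpers, not part of the add-in, and the encoder assumes plain ASCII input.

```vba
Option Explicit

' Hypothetical helper: percent-encode a string for a URL query.
' Assumption: the input contains only ASCII characters.
Function UrlEncodeAscii(ByVal s As String) As String
    Dim i As Long
    Dim ch As String
    Dim code As Long
    Dim result As String

    For i = 1 To Len(s)
        ch = Mid$(s, i, 1)
        code = Asc(ch)
        Select Case code
            Case 48 To 57, 65 To 90, 97 To 122, 45, 46, 95, 126
                ' Digits, letters and - . _ ~ pass through unchanged
                result = result & ch
            Case 32
                ' A space becomes "+" in a query string
                result = result & "+"
            Case Else
                ' Everything else becomes %XX (two hex digits)
                result = result & "%" & Right$("0" & Hex$(code), 2)
        End Select
    Next i

    UrlEncodeAscii = result
End Function

' Hypothetical helper: fill in the two GET parameters of the search URL.
Function BuildSearchUrl(ByVal searchTerm As String, ByVal location As String) As String
    BuildSearchUrl = "http://www.yellowpages.com/search" & _
        "?search_terms=" & UrlEncodeAscii(searchTerm) & _
        "&geo_location_terms=" & UrlEncodeAscii(location)
End Function
```

For example, BuildSearchUrl("Joe's Pizza", "10001") yields http://www.yellowpages.com/search?search_terms=Joe%27s+Pizza&geo_location_terms=10001.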

=GetElementByRegex("http://www.yellowpages.com/search?search_terms=[SEARCH_TERM]&geo_location_terms=[LOCATION]"; "class=""business-name""[^<>]*?>((?:.|\n)*?)<[^<>]*?/")

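No VBA, just a regular expression. Simply replace the GET parameters with your own. Of course, the regex may differ for other elements, but it is still simpler than writing the VBA from scratch.

Hope this helps.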
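To experiment with the pattern itself before pointing it at a live page, the same regular expression can be run against an HTML string in plain VBA through the VBScript.RegExp object. This is only a sketch, not how the add-in is implemented. The names ExtractFirstByRegex and DemoExtractBusinessName and the sample HTML are made up for illustration, and the pattern is assumed to contain at least one capture group.

```vba
Option Explicit

' Hypothetical helper: return the first capture group of pattern in html,
' or an empty string when nothing matches.
' Assumption: pattern contains at least one capture group.
Function ExtractFirstByRegex(ByVal html As String, ByVal pattern As String) As String
    Dim re As Object
    Dim matches As Object

    ' Late binding, so no library reference has to be added
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = pattern
    re.IgnoreCase = True
    re.Global = False          ' only the first match is needed

    Set matches = re.Execute(html)
    If matches.Count > 0 Then
        ExtractFirstByRegex = matches(0).SubMatches(0)
    Else
        ExtractFirstByRegex = ""
    End If
End Function

Sub DemoExtractBusinessName()
    Dim sampleHtml As String

    ' Made-up fragment standing in for a real result page
    sampleHtml = "<a class=""business-name"" href=""/x"">Joe's Pizza</a>"

    ' Prints: Joe's Pizza
    Debug.Print ExtractFirstByRegex(sampleHtml, _
        "class=""business-name""[^<>]*?>((?:.|\n)*?)<[^<>]*?/")
End Sub
```

Note that the capture stops at the first closing tag, so if the element wraps its text in a nested tag, the opening nested tag is captured along with the text and the pattern needs adjusting.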