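I am trying to take a list of names and zip codes from Excel, enter them one name and zip code at a time into the search fields on www.yellowpages.com, and return the resulting street addresses to Excel in the same order as the original names and zip codes. No error message is returned; the macro just stops without finishing. I don't know where it stops, but it does open Internet Explorer, enter the search terms, and click search, because I can watch that happen when .Visible = True. My best guess is that the problem is between the "" marks.
Here is my code (adapted from DontFretBrett and Dinesh Kumar Takyar):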
Sub Address_Scrape()
Dim eRow As Long
Dim ele As Object
Dim wb As Workbook
Dim srch As Worksheet
Dim trgt As Worksheet
Set wb = ThisWorkbook
Set srch = wb.Sheets("Master with addresses")
Set trgt = wb.Sheets("Sheet1")
Dim url As String
Dim zc As String
Dim Name As String
Name = srch.Range("B2")
zc = srch.Range("F2")
url = "URL;http://www.yellowpages.com/"
url = url & "/" & zc & "/" & Name
RowCount = 1
trgt.Range("A" & RowCount) = "Name"
trgt.Range("B" & RowCount) = "Address"
trgt.Range("C" & RowCount) = "City"
trgt.Range("D" & RowCount) = "State"
trgt.Range("E" & RowCount) = "Zip"
eRow = Sheet1.Cells(Rows.Count, 1).End(xlUp).Offset(1, 0).Row
Set objIE = CreateObject("InternetExplorer.Application")
With objIE
.navigate "http://www.yellowpages.com/"
.Visible = True
Do While .Busy Or _
.readyState <> 4
DoEvents
Loop
Set who = .document.getElementsByName("search_terms")
who.Item(0).Value = Name
Set where = .document.getElementsByName("geo_location_terms")
where.Item(0).Value = zc
.document.forms(0).submit
Do While .Busy Or _
.readyState <> 4
DoEvents
Loop
"Results = .document.getElementsByTagName("p")(0).innerText"
For Each ele In .document.all
Select Case ele.tagName
Case Results
RowCount = RowCount + 1
Case "Name"
trgt.Range("A" & RowCount) = ele.getElementByclass("business-name").innerText
Case "Address"
trgt.Range("B" & RowCount) = ele.getElementByclass("street-address").innerText
Case "City"
trgt.Range("C" & RowCount) = Trim(ele.getElementByclass("locality").innerText)
Case "State"
trgt.Range("D" & RowCount) = ele.getElementByitemprop("addressRegion").innerText
Case "Zip"
trgt.Range("E" & RowCount) = ele.getElementByitemprop("postalCode").innerText
End Select
Next ele
Set objIE = Nothing
End With
End Sub
Answer 0 (score: 0)
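You basically want to scrape data from a Yellow Pages search.
Some time ago I created a useful Excel add-in that performs this kind of lookup without resorting to VBA: http://blog.tkacprow.pl/excel-scrape-html-add/
Let's start at the beginning, with the GET URL structure: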
http://www.yellowpages.com/search?search_terms=[SEARCH_TERM]&geo_location_terms=[LOCATION]
[SEARCH_TERM] and [LOCATION] are your GET parameters.
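If you would rather assemble that URL in VBA than type it by hand, a minimal sketch could look like the following. BuildSearchUrl is a hypothetical helper name, not something from the add-in, and it assumes Excel 2013 or later on Windows, where WorksheetFunction.EncodeURL is available to escape spaces and punctuation in the name.

```vba
' Hypothetical helper: fills the two GET parameters of the search URL.
' Assumes Excel 2013+ on Windows (EncodeURL is not available elsewhere).
Function BuildSearchUrl(ByVal searchTerm As String, ByVal location As String) As String
    Const BASE_URL As String = "http://www.yellowpages.com/search"

    ' URL-encode both values so spaces, ampersands, etc. do not break the query string.
    BuildSearchUrl = BASE_URL & _
        "?search_terms=" & Application.WorksheetFunction.EncodeURL(searchTerm) & _
        "&geo_location_terms=" & Application.WorksheetFunction.EncodeURL(location)
End Function
```

With the sheet variables from the question, it could be called as url = BuildSearchUrl(srch.Range("B2").Value, srch.Range("F2").Value).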
Now, assuming you use the functions from the add-in, you need to get the text of the element with the class name "business-name". Use this function:
=GetElementByRegex("http://www.yellowpages.com/search?search_terms=[SEARCH_TERM]&geo_location_terms=[LOCATION]"; "class=""business-name""[^<>]*?>((?:.|\n)*?)<[^<>]*?/")
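No VBA, just a regular expression. Simply replace the GET parameters with your own. Of course, the regex may differ for other elements, but it is still simpler than writing VBA from scratch.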
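To check what the pattern captures before pointing it at the live site, you can run it against a small HTML string. The sketch below does that in VBA with the late-bound VBScript.RegExp object. FirstRegexGroup and DemoBusinessName are hypothetical names, and the sample markup is made up for illustration; the real page may be structured differently.

```vba
' Hypothetical helper: returns the first capture group of pat in html,
' or an empty string when there is no match.
Function FirstRegexGroup(ByVal html As String, ByVal pat As String) As String
    Dim re As Object
    Dim found As Object

    ' Late binding: no reference to "Microsoft VBScript Regular Expressions 5.5" required.
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = pat
    re.IgnoreCase = True
    re.Global = False          ' only the first match is needed

    Set found = re.Execute(html)
    If found.Count > 0 Then
        If found(0).SubMatches.Count > 0 Then
            FirstRegexGroup = found(0).SubMatches(0)
        End If
    End If
End Function

Sub DemoBusinessName()
    Dim sample As String
    Dim pat As String

    ' Made-up fragment for illustration only; not copied from yellowpages.com.
    sample = "<a class=""business-name"" href=""/x"">Example Plumbing</a>"

    ' Same pattern as in the worksheet formula above.
    pat = "class=""business-name""[^<>]*?>((?:.|\n)*?)<[^<>]*?/"

    Debug.Print FirstRegexGroup(sample, pat)   ' prints: Example Plumbing
End Sub
```

The capture group is lazy, so it stops at the first closing tag after the element's opening tag. That is why only the element's own text comes back rather than everything up to the end of the page.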
Hope this helps.