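I have written the code, but it does not work even for the first page. My goal is to extract the following establishment details from each page, for example: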
Column 1: 103 West Lounge (Food Service Inspections)
Column 2: 103 WEST PACES FERRY RD ATLANTA, GA 30318
(Skip this detail) View inspections:
Column 3: July 10, 2012 Score: 92, Grade: A
Column 4: July 26, 2013 Score: 90, Grade: A
Column 5: February 19, 2014 Score: 98, Grade: A
Column 6: December 12, 2014 Score: 100, Grade: A
Column 7: November 13, 2015 Score: 99, Grade: A
Currently the code only extracts URLs from the page, without any of the details. I need to find out what to change or what is wrong:
Sub Test()
    Dim IE As New InternetExplorer
    Dim html As HTMLDocument
    Dim link As Object
    Dim ws As Worksheet
    Set ws = Sheets("Sheet1")
    Application.ScreenUpdating = False
    Set IE = New InternetExplorer
    ' Test 2 pages (page 2 and page 3) starting from page 2. So far so good.
    For i = 2 To 4 Step 2
        myurl = "http://ga.healthinspections.us/georgia/search.cfm?start=" & i & "1&1=1&f=s&r=ANY&s=&inspectionType=Food&sd=03/26/2016&ed=04/25/2016&useDate=NO&county=Fulton&"
        IE.Visible = False
        IE.navigate myurl
        Do
            DoEvents
        Loop Until IE.readyState = READYSTATE_COMPLETE
        Set html = IE.document
        ' I assume the problem is here, because I need additional code to find these details.
        Set link = html.getElementsByTagName("a")
        ' This part was intended to test whether I can extract at least one detail.
        For m = 1 To 2
            For Each myurl In link
                Cells(m, 1) = link
            Next
        Next m
    Next i
    ' I also tried to test with MsgBox, but no luck either
    'MsgBox link
    IE.quit
    Set IE = Nothing
    Application.StatusBar = ""
    Application.ScreenUpdating = True
End Sub
Maybe I messed something up, or I simply lack the knowledge. :) Any help would be appreciated.
Answer 0: (Score: 0)
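Do you have the references set for Microsoft Internet Controls and Microsoft HTML Object Library? If so, try replacing the declaration part of the code with the following.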
Dim IE As SHDocVw.InternetExplorer
Dim html As MSHTML.HTMLDocument
Dim link As Object
Dim ws As Worksheet
Set ws = Sheets("Sheet1")
Application.ScreenUpdating = False
Set IE = New InternetExplorer
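Separately from the references, the inner loop in the question writes the collection itself (link) to the cell on every pass rather than the current element. Here is a minimal sketch of writing each link's text and address instead. It assumes html is the document already loaded by the question's code, and WriteLinks is a hypothetical helper name, not something from the original:

```vba
' Sketch only: WriteLinks is a hypothetical helper, called as "WriteLinks html"
' after the page has finished loading.
Sub WriteLinks(ByVal html As Object)
    Dim links As Object
    Dim lnk As Object
    Dim r As Long

    Set links = html.getElementsByTagName("a")
    r = 1
    With Sheets("Sheet1")
        For Each lnk In links
            .Cells(r, 1).Value = lnk.innerText   ' text shown on the page
            .Cells(r, 2).Value = lnk.href        ' address the link points to
            r = r + 1                            ' one row per link
        Next lnk
    End With
End Sub
```

This still collects every link on the page, so picking out only the establishment entries would need a filter based on the page's actual markup, which is not shown here.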
Answer 1: (Score: 0)
You can use the following approach to get the innerText of every element on the page.
Sub DumpData()
    Dim IE As Object, itm As Object
    Dim URL As String, RowCount As Long

    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = True
    URL = "http://ga.healthinspections.us/georgia/search.cfm?start=1&1=1&f=s&r=ANY&s=&inspectionType=Food&sd=03/26/2016&ed=04/25/2016&useDate=NO&county=Fulton&"

    IE.Navigate2 URL
    ' Wait for the site to fully load (4 = READYSTATE_COMPLETE)
    Do While IE.Busy Or IE.readyState <> 4
        DoEvents
    Loop

    RowCount = 1
    With Sheets("Sheet1")
        .Cells.ClearContents
        ' Dump the tag, id, class and text of every element, one row each
        For Each itm In IE.Document.all
            .Range("A" & RowCount) = itm.tagName
            .Range("B" & RowCount) = itm.ID
            .Range("C" & RowCount) = itm.className
            .Range("D" & RowCount) = Left(itm.innerText, 1024)
            RowCount = RowCount + 1
        Next itm
    End With
End Sub
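I got this from a nice guy named Joel. That's the kind of person he is.
Once the data is in the worksheet, do some simple cleanup to get rid of the extra stuff, and you should be all set.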
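What counts as "extra" depends on the page, so the following is only a sketch of one possible cleanup step. It assumes DumpData has filled columns A:D of Sheet1 and that rows with no text in column D are not needed; RemoveEmptyTextRows is a hypothetical name:

```vba
' Sketch only: delete dumped rows whose text column (D) is empty.
Sub RemoveEmptyTextRows()
    Dim lastRow As Long
    Dim r As Long

    With Sheets("Sheet1")
        ' Column A (tag name) is filled for every dumped element
        lastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
        ' Loop bottom-up so deleting a row does not shift rows still to be checked
        For r = lastRow To 1 Step -1
            If Len(Trim$(.Cells(r, "D").Value)) = 0 Then
                .Rows(r).Delete
            End If
        Next r
    End With
End Sub
```

Further filtering, for example keeping only certain tag names from column A, would follow the same bottom-up pattern.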