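I want to use VBA (Excel) to parse a long list of local web pages (.HTM files) and extract some data into Excel. The program needs to scrape more than 9,000 pages. Here is an example: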
> C:\Users\User_ID\Webpages\BS_1000.HTM.htm
C:\Users\User_ID\Webpages\BS_1001.HTM.htm
C:\Users\User_ID\Webpages\BS_1002.HTM.htm
C:\Users\User_ID\Webpages\BS_1003.HTM.htm
C:\Users\User_ID\Webpages\BS_1006.HTM.htm
C:\Users\User_ID\Webpages\BS_1007.HTM.htm
C:\Users\User_ID\Webpages\BS_1011.HTM.htm
C:\Users\User_ID\Webpages\BS_1012.HTM.htm
C:\Users\User_ID\Webpages\BS_1015.HTM.htm
C:\Users\User_ID\Webpages\BS_1016.HTM.htm
[... and the list goes on ...]
This is the VBA:
For startNumber = 1 To TotalProfiles
    Dim ie As InternetExplorerMedium
    Set ie = New InternetExplorerMedium
    ie.Visible = True
    Application.StatusBar = "Loading profile " & ProfileNumber & " from a total of " & TotalProfiles & " profiles"
    Set currentProfile = Worksheets("List_of_Files").Range("B" & CurrentRowPosition)
    ie.navigate currentProfile
    Application.StatusBar = "Loading profile: " & ProfileNumber & "; file location: " & currentProfile
    Do While ie.READYSTATE <> READYSTATE_COMPLETE
        DoEvents
    Loop
    Application.StatusBar = "Storing " & currentProfile & " information into HTMLElement"
    Set html = ie.document
    Set ie = Nothing
    [some code here...]
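The problem is that my current code opens every page in a new IE window (without closing the previous one). With more than 9,000 pages to scrape, this very quickly becomes a serious problem.
I am using Internet Explorer 11 with Microsoft Office 2013 (on Windows 7 Enterprise SP1).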
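What I would like is for IE to open every page in the same tab (essentially just refreshing the tab that is already "in use" once parsing is done and loading the next page), or at least to close the window when parsing is done and open the next page in a "new" window. Sadly, I have not managed to find a solution so far. Any help would be greatly appreciated.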
Answer 0 (score: 2)
A new window opens every time because this line sits at the start of your loop: Set ie = New InternetExplorerMedium
There are two ways to fix it.
Create the IE instance once before the loop and quit it after the loop has finished, like this:
Dim ie As InternetExplorerMedium
Set ie = New InternetExplorerMedium   ' one IE instance, created before the loop
ie.Visible = True
For startNumber = 1 To TotalProfiles
    Set currentProfile = Worksheets("List_of_Files").Range("B" & CurrentRowPosition)
    Application.StatusBar = "Loading profile: " & ProfileNumber & "; file location: " & currentProfile
    ie.navigate currentProfile          ' reuses the same window for every page
    ' Wait until the page has finished loading before touching the document
    Do While ie.READYSTATE <> READYSTATE_COMPLETE
        DoEvents
    Loop
    Application.StatusBar = "Storing " & currentProfile & " information into HTMLElement"
    Set html = ie.document
    [some code here...]
Next
Set html = Nothing
ie.Quit                                 ' close the single window once all pages are done
Set ie = Nothing
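Or keep creating IE inside the loop, but quit it at the end of each iteration before the next window opens, like this: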
For startNumber = 1 To TotalProfiles
    Dim ie As InternetExplorerMedium
    Set ie = New InternetExplorerMedium
    ie.Visible = True
    Application.StatusBar = "Loading profile " & ProfileNumber & " from a total of " & TotalProfiles & " profiles"
    Set currentProfile = Worksheets("List_of_Files").Range("B" & CurrentRowPosition)
    ie.navigate currentProfile
    Application.StatusBar = "Loading profile: " & ProfileNumber & "; file location: " & currentProfile
    ' Wait until the page has finished loading
    Do While ie.READYSTATE <> READYSTATE_COMPLETE
        DoEvents
    Loop
    Application.StatusBar = "Storing " & currentProfile & " information into HTMLElement"
    Set html = ie.document
    [some code here...]
    Set html = Nothing
    ie.Quit                             ' close this window before the next iteration opens a new one
    Set ie = Nothing
Next
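
With over 9,000 files, a single page that never reaches READYSTATE_COMPLETE would hang the wait loop used in both variants. As a rough sketch (not part of the original answer), the wait could be moved into a helper that gives up after a timeout. The name WaitForPage, the timeoutSeconds parameter, and its 30-second default are assumptions made for illustration; ie.Busy and ie.readyState are standard properties of the InternetExplorer object.

```vba
' Hypothetical helper: returns True once the page has loaded,
' or False if it has not finished within timeoutSeconds.
' Declared As Object so it accepts InternetExplorer or InternetExplorerMedium.
Function WaitForPage(ByVal ie As Object, Optional ByVal timeoutSeconds As Long = 30) As Boolean
    Const READYSTATE_COMPLETE_VALUE As Long = 4   ' numeric value of READYSTATE_COMPLETE
    Dim startTime As Date
    startTime = Now

    Do While ie.Busy Or ie.readyState <> READYSTATE_COMPLETE_VALUE
        DoEvents                                  ' keep Excel responsive while waiting
        If DateDiff("s", startTime, Now) > timeoutSeconds Then
            WaitForPage = False                   ' timed out; caller decides what to do
            Exit Function
        End If
    Loop

    WaitForPage = True
End Function
```

Inside either loop, the Do While ... Loop block could then be replaced by something like If Not WaitForPage(ie) Then followed by whatever skip or logging logic suits the workbook, so that one bad file does not stall the whole run.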