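I created a macro using Selenium to fetch the links of different posts from a website and then parse the title of each post after navigating to its target page. My first example, written as a single sub, works as expected.
I would like to split the macro into two subs and reuse the same driver in both, as I tried to do in the second example.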
Working one (using a single sub):
Sub FetchLinks()
    Const link$ = "https://stackoverflow.com/questions/tagged/web-scraping"
    Dim driver As New ChromeDriver, post As Object
    Dim itmLink As Variant, R&, iDic As Object

    Set iDic = CreateObject("Scripting.Dictionary")

    With driver
        .get link
        For Each post In .FindElementsByCss(".summary .question-hyperlink", timeout:=10000)
            iDic(post.Attribute("href")) = 1
        Next post

        For Each itmLink In iDic.keys
            driver.get itmLink
            Debug.Print .FindElementByCss("h1 > a.question-hyperlink").Text
        Next itmLink
    End With
End Sub
Can't get this one to work (trying to pass the driver to another sub so that it can be reused):
Sub FetchLinks()
    Const link$ = "https://stackoverflow.com/questions/tagged/web-scraping"
    Dim driver As New ChromeDriver, post As Object

    With driver
        .get link
        For Each post In .FindElementsByCss(".summary .question-hyperlink", timeout:=10000)
            FetchData driver, post.Attribute("href")
        Next post
    End With
End Sub

Sub FetchData(ByRef driver, ByRef nlink As String)
    Dim elem As Object

    With driver
        .get nlink
        Debug.Print .FindElementByCss("h1 > a.question-hyperlink").Text
    End With
End Sub
How can I share the ChromeDriver between two subs so that I can scrape content from the inner pages?
Answer 0 (score: 1)
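You are getting a stale element reference because you navigate away from the landing page to an inner page, and then the outer loop tries to keep referencing elements that belonged to the landing page. Put the links into a dictionary first and loop over that instead. Also, pass the arguments ByVal.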
Option Explicit

Public Sub FetchLinks()
    Const link$ = "https://stackoverflow.com/questions/tagged/web-scraping"
    Dim driver As ChromeDriver, post As Object, key As Variant
    Dim dict As Object

    Set dict = CreateObject("Scripting.Dictionary"): Set driver = New ChromeDriver

    With driver
        .get link
        ' Collect every href while the landing page is still loaded
        For Each post In .FindElementsByCss(".summary .question-hyperlink", timeOut:=10000)
            dict(post.Attribute("href")) = 1
        Next

        ' Navigate only after the loop over the page elements has finished
        For Each key In dict.keys
            FetchData driver, key
        Next key
    End With
End Sub

Public Sub FetchData(ByVal driver As ChromeDriver, ByVal nlink As String)
    With driver
        .get nlink
        Debug.Print .FindElementByCss("h1 > a.question-hyperlink").Text
    End With
End Sub