Is it possible to store HTML source grabbed with Selenium (using Excel VBA) in an HTMLDocument element?
This is an example of automating Internet Explorer using the Microsoft Internet Controls and Microsoft HTML Object Library references.
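Dim IE As InternetExplorer
Dim HTML As HTMLDocument
Set IE = New InternetExplorer
IE.navigate "www.google.com"
' Wait for the page to finish loading before reading the document
Do While IE.Busy Or IE.readyState <> READYSTATE_COMPLETE
    DoEvents
Loop
Set HTML = IE.document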
Can the same be done with Selenium? For example (not working!):
Dim selenium As SeleniumWrapper.WebDriver
Set selenium = New SeleniumWrapper.WebDriver
Dim html as HTMLDocument
selenium.Start "firefox", "about:blank"
selenium.Open "file:///D:/webpages/LE_1001.htm"
Set html = selenium.getHtmlSource 'this is not working since .getHtmlSource()
'returns a String object but is there a way to store
'this html source into a type of HTMLDocument-element
Answer 0 (score: 1)
This should work, using the string as the source of the HTML document:
Set html = New HTMLDocument
html.body.innerHTML = selenium.pageSource
Edit: changed the Selenium call from getHtmlSource to pageSource. Full working code is below. I am not sure whether we are using the same version of Selenium:
Option Explicit
Sub foo()
Dim sel As selenium.WebDriver
Set sel = New selenium.WebDriver
Dim html As HTMLDocument
sel.Start "firefox", "about:blank"
sel.Get "http://www.google.com/"
Set html = New HTMLDocument
html.body.innerHTML = sel.PageSource
Debug.Print html.body.innerText
End Sub
This references the Microsoft HTML Object Library and the Selenium Type Library (Selenium32.tlb), using SeleniumBasic version 2.0.6.0.
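Once the string has been loaded, the HTMLDocument can be queried with the usual MSHTML methods. The following is a minimal sketch, not part of the original answer: the helper name `HtmlFromString` and the sample markup are hypothetical, and a literal string stands in for `sel.PageSource` so the example runs without a browser. It assumes a reference to the Microsoft HTML Object Library.

```vba
Option Explicit

' Hypothetical helper: loads an HTML string into a new MSHTML document.
Function HtmlFromString(ByVal source As String) As HTMLDocument
    Dim doc As HTMLDocument
    Set doc = New HTMLDocument
    doc.body.innerHTML = source
    Set HtmlFromString = doc
End Function

Sub DemoHtmlFromString()
    Dim doc As HTMLDocument
    Dim link As Object

    ' Sample markup (assumption); in practice pass sel.PageSource here
    Set doc = HtmlFromString("<p>Hello <a href=""https://example.com/"">link</a></p>")

    ' Walk the parsed DOM like any other HTMLDocument
    For Each link In doc.getElementsByTagName("a")
        Debug.Print link.innerText, link.getAttribute("href")
    Next link
End Sub
```

Because the string is assigned to `body.innerHTML`, only body content is likely to end up in the document; anything that lives in the page's head (such as the title) should not be expected to survive, and scripts in the source are not run.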
Answer 1 (score: 1)
The proper way to get the DOM with SeleniumBasic:
Sub Get_DOM()
Dim driver As New FirefoxDriver
driver.Get "https://en.wikipedia.org/wiki/Main_Page"
Dim html As New HTMLDocument ' Requires Microsoft HTML Library
html.body.innerHTML = driver.ExecuteScript("return document.body.innerHTML;")
Debug.Print html.body.innerText
driver.Quit
End Sub
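To get the latest version, which works with the example above: https://github.com/florentbr/SeleniumBasic/releases/latest
Answer 2 (score: 0)
I am not really sure why you would want to convert a Selenium element to an HTMLDocument. Staying with Selenium's own types keeps the project's dependencies more limited.
Personally, I prefer assigning the DOM element to a WebElement. For example: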
Dim qtyElement As WebElement
Dim qtyHtml As String
' Only fetch the element if at least one match exists
If Selenium.FindElementsByClass("qty").Count > 0 Then
    Set qtyElement = Selenium.FindElementByClass("qty")
End If
If Not qtyElement Is Nothing Then
    qtyHtml = qtyElement.Attribute("innerHTML")
End If
Debug.Print qtyHtml
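The same check-then-read pattern can be wrapped in a small function. This is only a sketch under stated assumptions: the helper name `InnerHtmlByClass` is hypothetical, the URL is a placeholder, and the code uses only the SeleniumBasic calls that already appear in this thread (`Start`, `Get`, `FindElementsByClass`, `FindElementByClass`, `Attribute`, `Quit`). It assumes a reference to the Selenium Type Library.

```vba
Option Explicit

' Hypothetical helper: returns the innerHTML of the first element with the
' given class, or an empty string when no such element exists.
Function InnerHtmlByClass(ByVal driver As WebDriver, ByVal className As String) As String
    If driver.FindElementsByClass(className).Count > 0 Then
        InnerHtmlByClass = driver.FindElementByClass(className).Attribute("innerHTML")
    End If
End Function

Sub DemoInnerHtmlByClass()
    Dim driver As WebDriver
    Set driver = New WebDriver

    driver.Start "firefox", "about:blank"
    driver.Get "https://example.com/"   ' placeholder URL (assumption)

    ' Prints an empty line if the page has no element with class "qty"
    Debug.Print InnerHtmlByClass(driver, "qty")

    driver.Quit
End Sub
```

Counting the matches first avoids calling `FindElementByClass` on a page where the element is missing, which is the same guard the answer's snippet uses.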