How do I scrape <h2> text from a <div> using VBA Selenium on Chrome and insert the data into a spreadsheet?

Time: 2019-02-12 15:28:27

Tags: excel vba selenium web-scraping selenium-chromedriver

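I have been learning VBA for a project, but I can't scrape certain elements from the HTML and populate an Excel spreadsheet with them.

The code I'm using doesn't return any errors, and from what I can see it should work.

Here is my VBA code: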

Option Explicit

Public Sub GrabShipping()

    Dim t As Date

    Dim ele As Object

    Dim driver As New ChromeDriver

    Dim post As WebElement

    Dim i As Integer

    Dim mysheet As Worksheet


    Const MAX_WAIT_SEC As Long = 10
    Const INURL = "https://ss3.shipstation.com/#/dashboard"
    Const URL = "https://ss3.shipstation.com/"

    Set mysheet = Sheets("Main")

    With driver
        .Start "Chrome"
        .get URL
        t = Timer
        Do
            On Error Resume Next
            Set ele = .FindElementById("username")
            On Error GoTo 0
            If Timer - t > MAX_WAIT_SEC Then Exit Do
        Loop While ele Is Nothing
        If ele Is Nothing Then Exit Sub
        ele.SendKeys "Username"
        .FindElementById("password").SendKeys "Password"
        .FindElementById("btn-login").Click
    End With

    With driver
        .get INURL

        i = 2
        For Each post In driver.FindElementsByXPath("//div[contains(@class,'row-fluid stats')]")
            mysheet.Cells(i, 1) = post.FindElementByXPath(".//*[following-sibling:[contains(text(),'New Orders'").Attribute("New Orders")
            mysheet.Cells(i, 2) = post.FindElementByXPath(".//*[following-sibling:[contains(text(),'Ready to Ship'").Attribute("Ready to Ship")
            mysheet.Cells(i, 3) = post.FindElementByXPath(".//*[following-sibling:[contains(text(),'Orders Shipped'").Attribute("Orders Shipped")
        Next post


        Stop               '<==delete me later
        .Quit


    End With


End Sub

Here is the HTML I'm trying to scrape:

<div class="header row-fluid"><div class="row-fluid stats">
    <div class="col-sm-4 col-md-4 col-lg-4">
        <h2>2,318</h2>
        New Orders
    </div>
    <div class="col-sm-4 col-md-4 col-lg-4">
        <h2>53</h2>
        Ready to Ship
    </div>
    <div class="col-sm-4 col-md-4 col-lg-4">
        <h2>2,265</h2>
        Orders Shipped
    </div>
</div></div>

I expect it to return the values in the <h2>s to the spreadsheet, but currently, when I run the code, nothing is added.

3 answers:

Answer 0: (score: 1)

You can use a CSS selector combination:

Dim item As Object, nodeList As Object, r As Long
Set nodeList = driver.FindElementsByCss(".col-sm-4.col-md-4.col-lg-4 h2")
For Each item In nodeList
    r = r + 1
    ActiveSheet.Cells(r, 1) = item.Text
Next

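If the elements have not loaded yet when the code runs, you can re-use the timed loop from the question to wait for them: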

 Dim item As Object, nodeList As Object, r As Long
 t = Timer
 Do
     Set nodeList = driver.FindElementsByCss(".col-sm-4.col-md-4.col-lg-4 h2")
     If Timer - t > MAX_WAIT_SEC Then Exit Do
 Loop While nodeList.Count = 0
 If nodeList.Count > 0 Then
     For Each item In nodeList
         r = r + 1
         ActiveSheet.Cells(r, 1) = item.Text
     Next
 End If
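
The timed loop above can be wrapped in a small helper so the same wait works for any selector. This is only a sketch: `WaitForElements` is a hypothetical name, not part of the Selenium library, and it assumes `driver` is a ChromeDriver that has already been started. Because `Timer` resets to 0 at midnight, the elapsed time is corrected when it comes out negative.

```vba
' Hypothetical helper: polls until the selector matches at least one element
' or maxWaitSec seconds have passed. Returns the (possibly empty) list.
Public Function WaitForElements(ByVal driver As Object, ByVal css As String, _
                                ByVal maxWaitSec As Long) As Object
    Dim startTime As Single, elapsed As Single
    Dim nodeList As Object

    startTime = Timer
    Do
        Set nodeList = driver.FindElementsByCss(css)
        elapsed = Timer - startTime
        If elapsed < 0 Then elapsed = elapsed + 86400   ' Timer wrapped at midnight
        If elapsed > maxWaitSec Then Exit Do
        DoEvents                                        ' keep Excel responsive while waiting
    Loop While nodeList.Count = 0

    Set WaitForElements = nodeList
End Function
```

It could then be called as `Set nodeList = WaitForElements(driver, ".col-sm-4.col-md-4.col-lg-4 h2", MAX_WAIT_SEC)`, with the caller still checking `nodeList.Count` before looping.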

I suggest checking whether you can shorten the CSS selector, for example:

.col-sm-4 h2
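
Putting the pieces together, the three figures can be written across a single row, which is the layout the question's `mysheet.Cells(i, 1)` to `mysheet.Cells(i, 3)` lines were aiming for. This is a minimal sketch, assuming the shortened selector matches only the three stat headings on the page and that `driver` is already logged in and on the dashboard. `ParseCount` and `WriteStats` are hypothetical names introduced for illustration.

```vba
' Hypothetical helper: turns text such as "2,318" into the number 2318.
' Assumes the comma is a thousands separator, as in the sample HTML.
Public Function ParseCount(ByVal rawText As String) As Long
    ParseCount = CLng(Replace(Trim$(rawText), ",", ""))
End Function

' Hypothetical helper: writes each <h2> value into consecutive columns of one row.
Public Sub WriteStats(ByVal driver As Object, ByVal ws As Worksheet, ByVal rowIndex As Long)
    Dim item As Object, nodeList As Object, col As Long

    Set nodeList = driver.FindElementsByCss(".col-sm-4 h2")
    For Each item In nodeList
        col = col + 1
        ws.Cells(rowIndex, col).Value = ParseCount(item.Text)
    Next item
End Sub
```

A call such as `WriteStats driver, mysheet, 2` would fill columns A to C of row 2. Note that `ParseCount` raises a type mismatch error on empty or non-numeric text, so it is worth waiting for the page to finish loading first.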

Answer 1: (score: 0)

Instead of

post.FindElementByXPath(".//*[following-sibling:[contains(text(),'New Orders'").Attribute("New Orders")

try the code below and let me know if it helps.

post.FindElementByXPath("//div[@class='row-fluid stats']/div/h2").Text
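
One caveat with the expression above: in Selenium, an XPath that starts with `//` is evaluated against the whole document even when it is called on an element such as `post`, so it returns the first matching <h2> on the page every time. To pick a specific figure, a relative expression starting with `.//` can select the column by its label. A sketch, assuming the labels read exactly as in the sample HTML and `post` is the `row-fluid stats` container from the question's loop (`WriteStatsByLabel` is a hypothetical name):

```vba
' Hypothetical helper: looks up each figure by the label text in its column.
Public Sub WriteStatsByLabel(ByVal post As Object, ByVal ws As Worksheet, ByVal rowIndex As Long)
    ' ".//" keeps the search inside post; contains(., '...') tests the column's
    ' full text, which includes the label that follows the <h2>.
    ws.Cells(rowIndex, 1) = post.FindElementByXPath(".//div[contains(., 'New Orders')]/h2").Text
    ws.Cells(rowIndex, 2) = post.FindElementByXPath(".//div[contains(., 'Ready to Ship')]/h2").Text
    ws.Cells(rowIndex, 3) = post.FindElementByXPath(".//div[contains(., 'Orders Shipped')]/h2").Text
End Sub
```

Selecting by label keeps each value in the right column even if the order of the columns on the page changes.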

Answer 2: (score: 0)

You can use the following XPath to identify the <h2> node text:

//div[contains(@class,'row-fluid stats')]/div/h2

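However, you may get multiple matches, because there are several <h2> tags under the given class.

When looping, I assume the loop takes the values from the list one at a time, so you can change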

i = 2
For Each post In driver.FindElementsByXPath("//div[contains(@class,'row-fluid stats')]")
    mysheet.Cells(i, 1) = post.FindElementByXPath(".//*[following-sibling:[contains(text(),'New Orders'").Attribute("New Orders")
    mysheet.Cells(i, 2) = post.FindElementByXPath(".//*[following-sibling:[contains(text(),'Ready to Ship'").Attribute("Ready to Ship")
    mysheet.Cells(i, 3) = post.FindElementByXPath(".//*[following-sibling:[contains(text(),'Orders Shipped'").Attribute("Orders Shipped")
Next post

to the following:

i = 2
j = 1
' Each post is already an <h2> element here, so read its text directly
For Each post In driver.FindElementsByXPath("//div[contains(@class,'row-fluid stats')]/div/h2")
    mysheet.Cells(i, j) = post.Text
    j = j + 1
Next post

If the code above doesn't work, try the code below, which gets the text using the class name and the tag name:

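i = 2
j = 1
' Class locators generally expect a single class name, so use "stats"
' rather than the compound "row-fluid stats"
For Each post In driver.FindElementsByClass("stats")
    ' The container holds several <h2> tags, so loop over all of them
    For Each ele In post.FindElementsByTag("h2")
        mysheet.Cells(i, j) = ele.Text
        j = j + 1
    Next ele
Next post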

Hope this helps...