Scraping text returned from a page using VBA, Excel, and Selenium

Date: 2019-07-17 00:43:03

Tags: excel vba selenium web-scraping excel-2010

I am using Excel 2010, VBA, and Selenium. I switched to Selenium because IE does not handle the target website's pages properly.

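I can get through steps 1 and 2, but not step 3:

  1. Insert the domain name into the "domainToCheck" text box
  2. Click the "GoValue™" button
  3. Grab the estimated value from the results page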

The page returns an appraisal with an estimated value, which I want to capture and put into the spreadsheet. My code is as follows:

Public Sub getAppraisal()

Dim oBrowser    As New WebDriver
Dim oElement    As WebElement

Dim sEstimate   As String

Dim oCell       As Range
Dim oRng        As Range

oBrowser.Start "Firefox"

Set oRng = Range(Worksheets("sheet1").Range("A2"), Worksheets("sheet1").Range("A2").End(xlDown))

For Each oCell In oRng
    'GO TO THE APPRAISAL WEBPAGE
    oBrowser.Get "https://www.godaddy.com/domain-value-appraisal"
    'ENTER THE DOMAIN NAME TO BE APPRAISED
    oBrowser.FindElementByName("domainToCheck").SendKeys (oCell.Value)
    'CLICK THE SUBMIT BUTTON
    oBrowser.FindElementByClass("input-group-btn").Click
    oBrowser.Wait 1000
    'GRAB THE ESTIMATED VALUE
    sEstimate = oBrowser.FindElementByClass("dpp-price price").Value '<--- ERROR IS HERE "INVALID SELECTOR ERROR"

    'POPULATE THE ESTIMATE NEXT TO THE DOMAIN NAME ON THE SHEET
    oCell.Offset(0, 1).Value = CCur(sEstimate)
Next

oBrowser.Quit
Set oBrowser = Nothing

End Sub

The spreadsheet holds the domain names in column A, starting at A2; each estimate should go into column B next to its domain name.

The HTML DIV containing the result is as follows:

<div class="exact-domain-result"><div class="d-block wrap-text"><h1 class="m-b-1">activefs.com</h1></div><h3><span><img src="https://img1.wsimg.com/DomainValuation/icn-godaddy-valuation.png" style="padding-bottom: 7px;"> </span><span class="dpp-price"><span class="text-muted"><span>Estimated Value:</span></span></span> <span class="dpp-price price"><strong>$1,774</strong></span> <span id="currencyLabel"></span><sup><span><span class="d-block-inline valuation-tooltip"><span> <span style="cursor: pointer; outline: medium none;" class="uxicon uxicon-help" aria-haspopup="true" role="button"></span></span></span></span></sup></h3></div>

1 Answer:

Answer 0: (score: 2)

Root cause:

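If you look at the span node in the HTML, dpp-price price is not a single class: the element has two classes, dpp-price and price. FindElementByClass expects a single class name, so passing both names separated by a space is what makes Selenium raise the invalid selector error when it tries to find the element.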

How to fix it:

You can find the element with a CSS selector or an XPath expression, as shown below.

# css selector
.dpp-price.price
# xpath
//span[@class='dpp-price price']
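
Applied to the question's code, the failing line could be replaced roughly as in the sketch below. It assumes the SeleniumBasic type library is referenced in the VBA project (as the question's WebDriver and WebElement declarations suggest) and that the markup of the results page still matches the HTML quoted above. The sub name and the hard-coded domain are placeholders for illustration only. The sketch also reads .Text rather than .Value, on the assumption that the visible text of the span is what is wanted, since a span has no value attribute.

```vba
' Minimal sketch, not a tested solution: assumes SeleniumBasic is referenced
' and that the results page still uses the markup quoted in the question.
Public Sub getEstimateSketch()

    Dim oBrowser    As New WebDriver
    Dim sEstimate   As String

    oBrowser.Start "Firefox"
    oBrowser.Get "https://www.godaddy.com/domain-value-appraisal"

    ' Placeholder domain, taken from the sample HTML in the question
    oBrowser.FindElementByName("domainToCheck").SendKeys "activefs.com"
    oBrowser.FindElementByClass("input-group-btn").Click
    oBrowser.Wait 1000

    ' Match BOTH classes with a CSS selector instead of FindElementByClass
    sEstimate = oBrowser.FindElementByCss(".dpp-price.price").Text

    ' XPath alternative (exact match on the whole class attribute):
    ' sEstimate = oBrowser.FindElementByXPath("//span[@class='dpp-price price']").Text

    ' Show the captured text in the Immediate window
    Debug.Print sEstimate

    oBrowser.Quit
    Set oBrowser = Nothing

End Sub
```

The CSS form matches any element carrying both classes in any order, while the XPath form matches only when the class attribute is exactly "dpp-price price", so the CSS selector is likely to be the more robust of the two if the site adds further classes.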