XPATH - Find an element whose child has a specific value

Time: 2018-06-07 14:05:30

Tags: excel vba xpath web-scraping

I want to use XPath to find a specific value and write it to my Excel sheet.


The value should be written to my Excel sheet.

Edit: The H3 class is the only H3 class on this site.

 <div class="col-md-6 col-xs-12" id="infoBox">
<div class="col-xs-12 thumbnail thumbnail-more">
<div class="text-center hidden-sm hidden-xs" style="margin-top:-22px">
<H1 itemprop="name" class="titleDesktop toUpperCase">Tygra</H1>
<H2 class="productTypeText">Product Type: Funko Pop! Vinyl</H2>
</div>
<div class="clear"></div>
<div class="col-xs-6 text-center">
<div class="progress-bar progress-bar-success" role="progressbar"></div>
<H5><i data-toggle="tooltip" class="fa fa-question-circle-o" data-html="true" title="Value is updated daily from recent eBay sales. See <a style='color:white' href='faqcalc'>FAQ</a> for more info. Last update: 6 June 2018"></i> Trending at:</H5>
<H3 class="valueText">$10</H3>
</div>
<div class="col-xs-6 text-center">
<div class="progress-bar progress-bar-info" role="progressbar"></div>
<H5>#573</H5>
<H5>Release: Dec 2017</H5>
</div>
<div class="clear"><br></div>
<div class="col-xs-6 text-center">
<div class="progress-bar progress-bar-warning" role="progressbar"></div>
<h5>See more:</h5>
<div class="col-xs-12 no-gutter opacityHover">
<a href="/funko/all/thundercats"><div class="col-md-8 col-md-offset-2 col-xs-12 no-gutter" itemprop="category"><img class="img-responsive img-rounded img-center" src="/img/category/thundercats.png" alt="See more in Thundercats"><div class="no-gutter">Thundercats</div></div></a>
</div>

1 Answer:

Answer 0 (score: 0)

CSS selector approach:

The exact CSS selector for the element is:

#infoBox > div.col-xs-12.thumbnail.thumbnail-more > div:nth-child(3) > h3


You can simplify this to:

#infoBox *> div:nth-child(3) > h3

or even:

.valueText

Once you have the HTML document, you can access the element with any of the following selectors:

.querySelector("#infoBox > div.col-xs-12.thumbnail.thumbnail-more > div:nth-child(3) > h3").InnerText
.querySelector("#infoBox *> div:nth-child(3) > h3").InnerText
.querySelector(".valueText").InnerText

Here is an example that uses Internet Explorer and one of the CSS selectors shown above to extract the current price ($10):

Code:

Option Explicit

Public Sub GetInfo()
    Dim html As Object

    With CreateObject("InternetExplorer.Application")
        .Visible = True
        .navigate "https://stashpedia.com/funko/pop-vinyl/thundercats/tygra-exclusive-573"

        ' Wait until the page has finished loading (readyState 4 = complete)
        While .Busy Or .readyState < 4: DoEvents: Wend

        Set html = .document

        ' Print the text of the H3 price element to the Immediate window
        Debug.Print html.querySelector("#infoBox *> div:nth-child(3) > h3").innerText

        .Quit ' Remember to quit the application
    End With
End Sub

More information on CSS selectors here.
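Since the question asks for the value to end up in a worksheet rather than in the Immediate window, here is a minimal sketch of that last step. It reuses the same Internet Explorer lookup with the shortest selector, .valueText. The sub name WriteInfoToSheet, the sheet name "Sheet1" and the target cell A1 are assumptions for illustration only; adjust them to your workbook.

```vba
Option Explicit

' Sketch only: "Sheet1" and cell A1 are assumed names, not taken from the question.
Public Sub WriteInfoToSheet()
    Dim ie As Object
    Dim priceNode As Object

    Set ie = CreateObject("InternetExplorer.Application")
    With ie
        .Visible = False
        .navigate "https://stashpedia.com/funko/pop-vinyl/thundercats/tygra-exclusive-573"

        ' Wait until the page has finished loading (readyState 4 = complete)
        While .Busy Or .readyState < 4: DoEvents: Wend

        ' Look up the price element by its class
        Set priceNode = .document.querySelector(".valueText")

        ' Write the text into the sheet, or a marker if nothing matched
        If priceNode Is Nothing Then
            ThisWorkbook.Worksheets("Sheet1").Range("A1").Value = "Not found"
        Else
            ThisWorkbook.Worksheets("Sheet1").Range("A1").Value = priceNode.innerText
        End If

        .Quit ' Close Internet Explorer when done
    End With
End Sub
```

Checking for Nothing before reading innerText means the macro does not stop with a runtime error if the page layout changes and the selector no longer matches.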

XPath approach:

If you want to use XPath, which the approach shown above does not support, you can use Selenium Basic. Your script might look like this:

Option Explicit

Public Sub GetInfoSel()
    Const URL As String = "https://stashpedia.com/funko/pop-vinyl/thundercats/tygra-exclusive-573"
    Dim d As WebDriver

    Set d = New ChromeDriver

    With d
        .Start "Chrome"
        .Get URL

        ' Print the text of the H3 price element to the Immediate window
        Debug.Print .FindElementByXPath("//*[@id=""infoBox""]/div[1]/div[3]/h3").Text

        .Quit ' Close the browser when done
    End With
End Sub

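Note: After downloading and installing Selenium Basic, you need to go to Tools > References in the VBE and add a reference to the Selenium Type Library.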
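Because the question's edit says the H3 is the only one on the page, the positional XPath can probably be replaced by one that matches on the class attribute, which should be less sensitive to layout changes. The following is a sketch under that assumption. It also writes the result to a worksheet, where the sub name WriteInfoSel, "Sheet1" and cell A1 are placeholder names chosen for illustration.

```vba
Option Explicit

' Sketch only: requires Selenium Basic and a reference to the Selenium Type Library.
' "Sheet1" and cell A1 are assumed names, not taken from the question.
Public Sub WriteInfoSel()
    Const URL As String = "https://stashpedia.com/funko/pop-vinyl/thundercats/tygra-exclusive-573"
    Dim d As WebDriver

    Set d = New ChromeDriver

    With d
        .Start "Chrome"
        .Get URL

        ' Select the H3 by its class attribute instead of by its position
        ThisWorkbook.Worksheets("Sheet1").Range("A1").Value = _
            .FindElementByXPath("//h3[@class='valueText']").Text

        .Quit ' Close the browser when done
    End With
End Sub
```

The predicate [@class='valueText'] only matches when the class attribute is exactly that string, which is the case for the H3 in the HTML shown in the question.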
