Search innerText under two or more conditions

Asked: 2018-02-10 20:19:35

Tags: html vba excel-vba web-scraping excel

I am running a Google search from Excel VBA. The text I want to extract sits inside a span tag:

 <div class="f kv_Swb" style="white-space:nowrap">
   ...
   <span class="st">
     <span class="f">no relevant text</span>
     this is the text it matters, it has a keyword i need
   </span>
 </div>

There are many nested div tags.

The string I need is inside the element with class st, but outside the element with class f. I am using a VBA script like this:

 Dim IE As Object
 Dim doc As Object
 Dim elementA As Object
 Dim elementB As Object
 Dim TagA As Object
 Dim TagB As Object

 Set IE = CreateObject("InternetExplorer.Application")
 IE.Navigate "http://www.unsuspectwebpage.com/about"

 ' Wait until the page has finished loading before reading the document
 Do While IE.Busy Or IE.ReadyState <> 4
   DoEvents
 Loop

 Set doc = IE.Document

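 Set TagA = doc.getElementsByTagName("span")
 ' Note: the outer loop reruns the same class lookup once per span on the page
 For Each elementA In TagA
   Set TagB = doc.getElementsByClassName("st")
   For Each elementB In TagB
     ' innerText also includes the text of the nested span with class f
     ws.Range("A1") = ws.Range("A1") & elementB.innerText
   Next elementB
 Next elementA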

How can I get the text that is inside class st but outside class f?

1 Answer:

Answer 0: (score: 3)

Not a very efficient approach, but it should fetch the desired content:

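' Requires a reference to the Microsoft HTML Object Library.
' HTML is assumed to already hold the parsed page content.
Dim elem As Object, HTML As New HTMLDocument

For Each elem In HTML.getElementsByClassName("st")
    ' Split the full text on the nested span's text and keep the part after it
    Debug.Print Split(elem.innerText, elem.getElementsByTagName("span")(0).innerText)(1)
Next elem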

Output:

this is the text it matters, it has a keyword i need
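
The Split approach above depends on the nested span being present and non-empty. As a possible alternative, here is a minimal sketch that walks the direct child nodes of each st element and keeps only the text nodes, so anything inside a nested element such as the span with class f is skipped. The names DirectText and Demo are hypothetical helpers introduced for illustration, the sample markup is copied from the question, and the code assumes a reference to the Microsoft HTML Object Library is set.

```vba
' Sketch only; assumes Tools > References > Microsoft HTML Object Library.

' Hypothetical helper: concatenates the text nodes that are direct
' children of elem, ignoring text inside nested elements.
Function DirectText(ByVal elem As Object) As String
    Dim node As Object
    Dim result As String

    For Each node In elem.childNodes
        If node.nodeType = 3 Then        ' 3 = text node
            result = result & node.nodeValue
        End If
    Next node

    DirectText = Trim$(result)           ' Trim$ strips spaces only, not line breaks
End Function

' Hypothetical driver that loads the question's markup into a document
' and prints the direct text of every element with class st.
Sub Demo()
    Dim doc As New HTMLDocument
    Dim elem As Object

    doc.body.innerHTML = "<div class=""f kv_Swb""><span class=""st"">" & _
        "<span class=""f"">no relevant text</span>" & _
        "this is the text it matters, it has a keyword i need</span></div>"

    For Each elem In doc.getElementsByClassName("st")
        Debug.Print DirectText(elem)
    Next elem
End Sub
```

On a live page the text nodes may carry line breaks or extra whitespace from the source markup, so some additional cleanup may be needed before writing the result to a worksheet.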