I am running a Google search from Excel VBA. The text I want to extract is inside a span tag:
<div class="f kv_Swb" style="white-space:nowrap">
...
<span class="st">
<span class="f">no relevant text</span>
this is the text it matters, it has a keyword i need
</span>
</div>
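There are many nested div tags. The string I need is inside the element with class st but outside the element with class f. As I said, I am using a VBA script like this: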
Dim IE As Object
Dim doc As Object
Dim elementA As Object
Dim elementB As Object
Dim TagA As Object
Dim TagB As Object
Dim ws As Worksheet
Set ws = ActiveSheet ' assumption: the original does not show how ws is set
Set IE = CreateObject("InternetExplorer.Application")
IE.Navigate "http://www.unsuspectwebpage.com/about"
' Wait for the page to finish loading before reading the document
Do Until IE.ReadyState = 4
    DoEvents
Loop
Set doc = IE.Document
Set TagA = doc.getElementsByTagName("span")
For Each elementA In TagA
    Set TagB = doc.getElementsByClassName("st")
    For Each elementB In TagB
        ' innerText also includes the text of the nested class f span
        ws.Range("A1") = ws.Range("A1") & elementB.innerText
    Next elementB
Next elementA
How can I get the text that is inside class st but outside class f?
Answer 0 (score: 3)
This is not a very efficient approach, but it should fetch the required content:
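' Requires a reference to "Microsoft HTML Object Library" (Tools > References).
Dim elem As Object, HTML As New HTMLDocument
' HTML must already be loaded with the page markup before this loop runs.
For Each elem In HTML.getElementsByClassName("st")
    ' Split the full text on the inner span's text and keep the part after it.
    Debug.Print Split(elem.innerText, elem.getElementsByTagName("span")(0).innerText)(1)
Next elem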
Output:
this is the text it matters, it has a keyword i need
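
The Split approach depends on the inner span's text not being empty and not appearing elsewhere in the string. A possible alternative, sketched here as an assumption rather than as part of the original answer, is to walk the direct child nodes of the class st element and keep only its text nodes (nodeType 3). The names OwnText and DemoOwnText are hypothetical, and the markup is a shortened copy of the sample from the question. The demo filters by className instead of calling getElementsByClassName, since that method may not be available on a late-bound "htmlfile" document.

```vba
Option Explicit

' Returns the text held directly in elem's own text nodes,
' skipping text that belongs to child elements such as <span class="f">.
Function OwnText(ByVal elem As Object) As String
    Dim i As Long
    Dim node As Object
    Dim result As String

    For i = 0 To elem.childNodes.Length - 1
        Set node = elem.childNodes.Item(i)
        If node.nodeType = 3 Then ' 3 = text node
            result = result & node.nodeValue
        End If
    Next i

    OwnText = Trim$(result) ' Trim$ removes leading/trailing spaces only
End Function

Sub DemoOwnText()
    Dim html As Object
    Dim sp As Object

    ' Late-bound MSHTML document, so no library reference is needed.
    Set html = CreateObject("htmlfile")
    html.body.innerHTML = _
        "<div><span class=""st""><span class=""f"">no relevant text</span>" & _
        " this is the text it matters, it has a keyword i need</span></div>"

    ' Check className on each span instead of using getElementsByClassName.
    For Each sp In html.getElementsByTagName("span")
        If sp.className = "st" Then Debug.Print OwnText(sp)
    Next sp
End Sub
```

With the IE automation code from the question, the same function could be called on each element found in doc once the page has loaded. Because only direct text nodes are collected, text nested more than one level deep is ignored as well.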