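I'm trying to declare an array of nodes (that part isn't a problem) and then scrape the innerHTML of two child nodes inside each element of the array. Taking SE as an example (using IE object methods), suppose I want to scrape the titles and question excerpts from the home page. There is an array of nodes (class name: "question-summary"), and each one has two child nodes: the title (class name: "question-hyperlink") and the excerpt (class name: "excerpt"). The code I'm using is as follows: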
    Sub Scraper()
        Dim ie As Object
        Dim doc As Object, oQuestionShells As Object, oQuestionTitle As Object, oQuestion As Object, oElement As Object
        Dim QuestionShell As String, QuestionTitle As String, Question As String, sURL As String

        Set ie = CreateObject("internetexplorer.application")
        sURL = "https://stackoverflow.com/questions/tagged/excel-formula"
        QuestionShell = "question-summary"
        QuestionTitle = "question-hyperlink"
        Question = "excerpt"

        With ie
            .Visible = False
            .Navigate sURL
        End With

        Set doc = ie.Document 'Stepping through so doc is getting assigned (READY_STATE = 4)
        Set oQuestionShells = doc.getElementsByClassName(QuestionShell)

        For Each oElement In oQuestionShells
            Set oQuestionTitle = oElement.getElementByClassName(QuestionTitle) 'Assigning this object causes an "Object doesn't support this property or method"
            Set oQuestion = oElement.getElementByClassName(Question) 'Assigning this object causes an "Object doesn't support this property or method"
            Debug.Print oQuestionTitle.innerHTML
            Debug.Print oQuestion.innerHTML
        Next
    End Sub
Answer 0 (score: 2)
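getElementByClassName is not a method. You can only use getElementsByClassName (note the plural in the method name), which returns an IHTMLElementCollection.
Declaring the variable as Object instead of IHTMLElementCollection is fine, but you still need to supply an index to get at a specific element in the collection.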
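Assume each oElement (a question-summary node) contains exactly one element of class question-hyperlink and one element of class excerpt. Then you can call getElementsByClassName and append (0) to pull out the first element of the returned collection.
So the corrected lines are:

    Set oQuestionTitle = oElement.getElementsByClassName(QuestionTitle)(0)
    Set oQuestion = oElement.getElementsByClassName(Question)(0)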
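
If that assumption might not hold for every question block, indexing an empty collection with (0) would most likely give you Nothing, and the following .innerHTML call would then fail. Here is a minimal sketch of a guard, assuming late binding as in the rest of the code. FirstByClass, oParent, sClass and oMatches are hypothetical names of mine, not part of the original code or of any library:

```vba
' Hypothetical helper (an illustration, not part of the original code):
' returns the first descendant of oParent that has class sClass,
' or Nothing when no such descendant exists.
Function FirstByClass(ByVal oParent As Object, ByVal sClass As String) As Object
    Dim oMatches As Object
    Set oMatches = oParent.getElementsByClassName(sClass)
    ' Only index into the collection when it actually has items
    If oMatches.Length > 0 Then Set FirstByClass = oMatches(0)
End Function
```

Inside the loop it could be used like this: Set oQuestionTitle = FirstByClass(oElement, QuestionTitle), followed by If Not oQuestionTitle Is Nothing Then Debug.Print oQuestionTitle.innerHTML.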
Full working code (with a couple of updates, namely using Option Explicit and waiting for the page to load):
    Option Explicit

    Sub Scraper()
        Dim ie As Object
        Dim doc As Object, oQuestionShells As Object, oQuestionTitle As Object, oQuestion As Object, oElement As Object
        Dim QuestionShell As String, QuestionTitle As String, Question As String, sURL As String

        Set ie = CreateObject("internetexplorer.application")
        sURL = "https://stackoverflow.com/questions/tagged/excel-formula"
        QuestionShell = "question-summary"
        QuestionTitle = "question-hyperlink"
        Question = "excerpt"

        With ie
            .Visible = True
            .Navigate sURL
            'Wait until the page has finished loading
            Do
                DoEvents
            Loop While .ReadyState < 4 Or .Busy
        End With

        Set doc = ie.Document
        Set oQuestionShells = doc.getElementsByClassName(QuestionShell)

        For Each oElement In oQuestionShells
            'Debug.Print TypeName(oElement)
            Set oQuestionTitle = oElement.getElementsByClassName(QuestionTitle)(0)
            Set oQuestion = oElement.getElementsByClassName(Question)(0)
            Debug.Print oQuestionTitle.innerHTML
            Debug.Print oQuestion.innerHTML
        Next

        ie.Quit
    End Sub