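I'm trying to extract only the inner text of the rightmost cell in an HTML table row. Here is a small part of the HTML. The row contains 810 cells; the TR tag contains 811 TD tags: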
</tr><tr align="center" id="spt_inner_row_2"><td nowrap="nowrap" bgcolor="#EEEEEE" style="border-bottom: 1px solid white; border-right: 1px solid white">
300 - 305
</td><td nowrap="nowrap" bgcolor="#EEEEEE" style="border-bottom: 1px solid white; border-right: 1px solid white">
300 - 305
</td><td nowrap="nowrap" bgcolor="#EEEEEE" style="border-bottom: 1px solid white; border-right: 1px solid white">
300 - 305
</td><td nowrap="nowrap" bgcolor="#EEEEEE" style="border-bottom: 1px solid white; border-right: 1px solid white">
300 - 305
The code I'm currently using successfully extracts the data from every cell and writes it to column A of the active sheet:
Sub GetData()
    Dim URL As String
    Dim IE As InternetExplorer
    Dim HTMLdoc As HTMLDocument
    Dim TDelements As IHTMLElementCollection
    Dim TDelement As HTMLTableCell
    Dim r As Long

    'For login use
    Dim LoginForm As HTMLFormElement
    Dim UserNameInputBox As HTMLInputElement
    Dim PasswordInputBox As HTMLInputElement
    Dim SignInButton As Object 'Was undeclared; needed when Option Explicit is on

    URL = "https://www.whatever.com"
    Set IE = New InternetExplorer
    With IE
        .navigate URL
        .Visible = True
        'Wait for page to load
        While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Wend
        Set HTMLdoc = .document

        'Enter login info
        Set LoginForm = HTMLdoc.forms(0)
        'Username
        Set UserNameInputBox = LoginForm.elements("username")
        UserNameInputBox.Value = "username"
        'Password
        Set PasswordInputBox = LoginForm.elements("password")
        PasswordInputBox.Value = "password"
        'Get the form input button and click it
        Set SignInButton = LoginForm.elements("doLogin")
        SignInButton.Click
        'Wait for the new page to load
        Do While IE.readyState <> READYSTATE_COMPLETE Or IE.Busy: DoEvents: Loop

        'Auto-navigate to start page, so we need to navigate once more
        .navigate URL
        Do While IE.readyState <> READYSTATE_COMPLETE Or IE.Busy: DoEvents: Loop
        'Re-read the document so HTMLdoc refers to the page that was loaded last
        Set HTMLdoc = .document
    End With

    'Specify how to recognize data to extract
    Set TDelements = HTMLdoc.getElementById("spt_inner_row_2").getElementsByTagName("TD")

    'Write each cell's text to column A, one row per cell
    r = 0
    For Each TDelement In TDelements
        ActiveSheet.Range("A1").Offset(r, 0).Value = TDelement.innerText
        r = r + 1
    Next
End Sub
What I really need is to extract only the last (rightmost) cell of the HTML table row. Any suggestions?
Answer 0 (score: 0)
IHTMLElementCollection has a length property and an item method. item accepts a numeric index, but the index is zero-based, so the last entry is at length - 1:
Dim TDelements As IHTMLElementCollection
Set TDelements = HTMLdoc.getElementById("spt_inner_row_2").getElementsByTagName("TD")
With TDelements
    MsgBox .Item(.Length - 1).InnerText
End With
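To match the question's goal of writing to the worksheet instead of showing a message box, the same idea might be wrapped up as follows. This is only a sketch: the procedure name WriteLastCell and the choice of cell A1 are hypothetical, and it assumes HTMLdoc already refers to the fully loaded page, as in the question's GetData routine. The two guards are there because length - 1 is not a valid index when the row is missing or has no cells.

```vba
' Sketch only: WriteLastCell and the target cell A1 are illustrative choices,
' not part of the original question or answer.
Sub WriteLastCell(ByVal HTMLdoc As HTMLDocument)
    Dim rowElement As Object   ' late-bound so getElementsByTagName resolves at run time
    Dim TDelements As IHTMLElementCollection

    Set rowElement = HTMLdoc.getElementById("spt_inner_row_2")
    If rowElement Is Nothing Then
        MsgBox "Row spt_inner_row_2 was not found."
        Exit Sub
    End If

    Set TDelements = rowElement.getElementsByTagName("TD")
    If TDelements.Length = 0 Then
        MsgBox "The row contains no TD cells."
        Exit Sub
    End If

    ' The collection is zero-based, so the last cell is at Length - 1
    ActiveSheet.Range("A1").Value = TDelements.Item(TDelements.Length - 1).innerText
End Sub
```

It could be called from the end of GetData as WriteLastCell HTMLdoc, in place of the For Each loop.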