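I am building my first VBA code to scrape data from a website. I can open the site and navigate the buttons to get the right data on screen, but I am struggling to reference the correct table to loop through. I want to access the embedded table "activity". To do this, I took an answer from here on how to loop through a table and extract information, and embedded it in my code. Below are the three areas where I get errors.
Are these linked (especially queries B and C), and does anyone have any ideas?
Many thanks!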
------ Solution code (from QHarr's answer below) ------
Note: references are required (VBE > Tools > References, then add references to): Microsoft Internet Controls, Microsoft HTML Object Library
Public Sub GetTable()
    Dim IE As InternetExplorer, ele As Object, clipboard As Object, hTable As HTMLTable, t As Date, ws As Worksheet
    Set ws = ThisWorkbook.Worksheets("Sheet1")
    Const MAX_WAIT_SEC As Long = 20
    Set IE = New InternetExplorer
    With IE
        .Visible = True
        .navigate "https://na3.docusign.net/Member/EmailStart.aspx?a=59595fcb-34be-4375-b880-a0be581d0f37&r=f6d28b49-e66d-4fa4-a7e9-69c2c741fde5"
        While .Busy Or .readyState < 4: DoEvents: Wend
        t = Timer
        Do
            DoEvents
            On Error Resume Next
            Set ele = .Document.querySelector("[data-qa='show-history']")
            'On Error GoTo 0 'I removed this line as it was throwing an error as soon as the 'Show-history' element loaded.
            If Timer - t > MAX_WAIT_SEC Then Exit Do
        Loop While ele Is Nothing
        If ele Is Nothing Then Exit Sub
        ele.Click
        While .Busy Or .readyState < 4: DoEvents: Wend
        Set hTable = .Document.querySelector("[data-qa='history-dialog-audit-logs']")
        ''**********************************************************************
        '' Loop table and write out method. This method uses the sub WriteTable
        Application.ScreenUpdating = False
        WriteTable hTable, 1, ws
        Application.ScreenUpdating = True
        ''**********************************************************************
        .Quit
    End With
End Sub
Public Sub WriteTable(ByVal hTable As HTMLTable, Optional ByVal startRow As Long = 1, Optional ByVal ws As Worksheet)
    If ws Is Nothing Then Set ws = ActiveSheet
    Dim tRow As Object, tCell As Object, tr As Object, td As Object, r As Long, c As Long
    r = startRow
    With ws
        Set tRow = hTable.getElementsByTagName("tr")
        For Each tr In tRow
            Set tCell = tr.getElementsByTagName("td")
            c = 1
            For Each td In tCell
                .Cells(r, c).Value = td.innerText
                c = c + 1
            Next td
            r = r + 1
        Next tr
    End With
End Sub
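------ Original question code ------
Query A: I get an "Object required" error while the page is loading. If I continue running the script the error goes away, so I assume this is a load-timing problem? It occurs after the "Loop" code completes: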
With objIE
.Visible = True
.navigate WebSite
Do While .Busy Or .readyState <> 4
DoEvents
Loop
.document.querySelector("[data-qa='show-history']").Click
Query B: I get another "Object required" error on this line, which I can also continue past:
For Each ele In objIE.document.getElementById("activity").getElementsByTagName("tr")
Query C: I get a "Subscript out of range" error on the next line and cannot continue past it:
Sheets("Sheet1").Range("A" & y).Value = ele.Children(0).textContent
Sub googlesearch3()
    Set objIE = CreateObject("InternetExplorer.Application")
    WebSite = "websiteurl"
    With objIE
        .Visible = True
        .navigate WebSite
        Do While .Busy Or .readyState <> 4
            DoEvents
        Loop
        .document.querySelector("[data-qa='show-history']").Click
    End With
    'within the 'history-dialog-audit-logs' table, loop and extract data
    'we will output data to excel, starting on row 1
    y = 1
    'look at all the 'tr' elements in the element with id 'activity',
    'and evaluate each, one at a time, using 'ele' variable
    For Each ele In objIE.document.getElementById("activity").getElementsByTagName("tr")
        'show the text content of 'tr' element being looked at
        Debug.Print ele.textContent
        'each 'tr' (table row) element contains 4 children ('td') elements
        'put text of 1st 'td' in col A
        Sheets("Sheet1").Range("A" & y).Value = ele.Children(0).textContent
        'put text of 2nd 'td' in col B
        Sheets("Sheet1").Range("B" & y).Value = ele.Children(1).textContent
        'put text of 3rd 'td' in col C
        Sheets("Sheet1").Range("C" & y).Value = ele.Children(2).textContent
        'put text of 4th 'td' in col D
        Sheets("Sheet1").Range("D" & y).Value = ele.Children(3).textContent
        'increment row counter by 1
        y = y + 1
        'repeat until last ele has been evaluated
    Next
    'check if word 'completed' is mentioned anywhere, if so update 'Status' to 'Completed' and search for text.
    'Find "signed the envelope" and show all text before this until you find <td>. Stop after one occurrence
    'store text in 'LastSigned' string
    'find "sent an invitation to" and show all text before this until you find <td>. Stop after one occurrence
    'store text in 'CurrentlyWith' string
    Set IE = Nothing
End Sub
Other: I tried the answer here, but the Dim statement does not work...
Answer 0 (score: 1)
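Here are two ways to write out the table. One uses the clipboard; the other loops over the rows and the table cells within each row (that version is commented out, 3 lines). I use a timed loop of up to MAX_WAIT_SEC seconds to give the clickable element time to be present, to try to address your question 1. There is not enough HTML for me to give a good explanation of questions 2 and 3. They may all be timing-related to begin with.
Note: usually, after .Click you want another While .Busy Or .readyState < 4: DoEvents: Wend, and possibly another Do Loop, so that the page content has time to update.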
Option Explicit
Public Sub GetTable()
    Dim IE As InternetExplorer, ele As Object, clipboard As Object, hTable As HTMLTable, t As Date, ws As Worksheet
    Set ws = ThisWorkbook.Worksheets("Sheet1")
    Const MAX_WAIT_SEC As Long = 5
    Set IE = New InternetExplorer
    With IE
        .Visible = True
        .navigate "yourURL"
        While .Busy Or .readyState < 4: DoEvents: Wend
        t = Timer
        Do
            DoEvents
            On Error Resume Next
            Set ele = .Document.querySelector("[data-qa='show-history']")
            On Error GoTo 0
            If Timer - t > MAX_WAIT_SEC Then Exit Do
        Loop While ele Is Nothing
        If ele Is Nothing Then Exit Sub
        ele.Click
        While .Busy Or .readyState < 4: DoEvents: Wend
        Set hTable = .Document.querySelector("#activity .dstable")
        ''*********************************************************************
        ''Copy table to clipboard and paste method
        Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
        clipboard.SetText hTable.outerHTML 'SetText is a method, so it takes an argument rather than an assignment
        clipboard.PutInClipboard
        ws.Cells(1, 1).PasteSpecial
        ''**********************************************************************
        ''**********************************************************************
        '' Loop table and write out method. This method uses the sub WriteTable
        ' Application.ScreenUpdating = False '<==Uncomment these 3 lines and comment out lines above if using this method.
        ' WriteTable hTable, 1, ws
        ' Application.ScreenUpdating = True
        ''**********************************************************************
        .Quit
    End With
End Sub
Public Sub WriteTable(ByVal hTable As HTMLTable, Optional ByVal startRow As Long = 1, Optional ByVal ws As Worksheet)
    If ws Is Nothing Then Set ws = ActiveSheet
    Dim tRow As Object, tCell As Object, tr As Object, td As Object, r As Long, c As Long
    r = startRow
    With ws
        Set tRow = hTable.getElementsByTagName("tr")
        For Each tr In tRow
            Set tCell = tr.getElementsByTagName("td")
            c = 1
            For Each td In tCell
                .Cells(r, c).Value = td.innerText
                c = c + 1
            Next td
            r = r + 1
        Next tr
    End With
End Sub
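The note about adding another wait after .Click could be sketched as a small reusable polling helper. This is only a sketch: WaitForSelector and ElapsedSeconds are hypothetical names that do not appear in the original answer, and it assumes the document object passed in supports querySelector (as the IE document used above does).

```vba
Option Explicit

' Hypothetical helper (not part of the original answer).
' Polls doc.querySelector(selector) until it returns an element
' or timeoutSec seconds have elapsed. Returns Nothing on timeout.
Public Function WaitForSelector(ByVal doc As Object, ByVal selector As String, _
                                Optional ByVal timeoutSec As Double = 5) As Object
    Dim startTime As Double
    Dim found As Object

    startTime = Timer
    Do
        DoEvents
        ' The document may not be ready yet, so ignore errors for this one call only.
        On Error Resume Next
        Set found = doc.querySelector(selector)
        On Error GoTo 0
        If Not found Is Nothing Then Exit Do
    Loop While ElapsedSeconds(startTime, Timer) < timeoutSec

    Set WaitForSelector = found
End Function

' Seconds between two Timer readings, allowing for Timer
' resetting to 0 at midnight (86400 seconds per day).
Public Function ElapsedSeconds(ByVal startTime As Double, ByVal nowTime As Double) As Double
    If nowTime >= startTime Then
        ElapsedSeconds = nowTime - startTime
    Else
        ElapsedSeconds = nowTime + 86400# - startTime
    End If
End Function
```

Inside the With IE block it might be used as Set hTable = WaitForSelector(.Document, "#activity .dstable", MAX_WAIT_SEC), followed by If hTable Is Nothing Then Exit Sub, so that a table which has not loaded yet does not lead to an "Object required" error further on.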
References (VBE > Tools > References, then add references to): Microsoft Internet Controls, Microsoft HTML Object Library
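Edit: in some cases the late-bound clipboard reference now seems to be problematic. Here is the generic early-bound method, where hTable is the target HTMLTable object.
For the early-bound clipboard, go to VBE > Tools > References > Microsoft Forms 2.0 Object Library.
If you add a UserForm to the project, the library is added automatically.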
Dim clipboard As DataObject
Set clipboard = New DataObject
clipboard.SetText hTable.outerHTML
clipboard.PutInClipboard
ThisWorkbook.Worksheets("Sheet1").Cells(1, 1).PasteSpecial
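Since querySelector returns Nothing when the table has not loaded, the early-bound paste above could be wrapped in a guard. The following is a minimal sketch only: PasteTableHtml is a hypothetical name, and it assumes the Microsoft Forms 2.0 Object Library reference has been added.

```vba
Option Explicit

' Hypothetical wrapper (not part of the original answer).
' Requires a reference to Microsoft Forms 2.0 Object Library for DataObject.
' Copies tableElement.outerHTML to the clipboard and pastes it at target.
' Returns False without pasting if either argument is Nothing.
Public Function PasteTableHtml(ByVal tableElement As Object, ByVal target As Range) As Boolean
    Dim clipboard As DataObject

    If tableElement Is Nothing Then Exit Function
    If target Is Nothing Then Exit Function

    Set clipboard = New DataObject
    clipboard.SetText tableElement.outerHTML
    clipboard.PutInClipboard
    target.PasteSpecial
    PasteTableHtml = True
End Function
```

A call might look like If Not PasteTableHtml(hTable, ThisWorkbook.Worksheets("Sheet1").Cells(1, 1)) Then Debug.Print "Table not found", which reports a missing table rather than raising an error.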