VBA web automation: scraping non-text content from a table or by tag name (td)

Date: 2017-05-25 15:08:57

Tags: vba web automation scrape




'start a new subroutine called soccer_stats
Sub soccer_stats()
    'dimension (declare or set aside memory for) our variables
    Dim objIE As InternetExplorer 'special object variable representing the IE browser
    Dim aEle As HTMLLinkElement 'special object variable for an <a> (link) element
    Dim y As Integer 'integer variable we'll use as a counter
    Dim result As String 'string variable that will hold our result link
    Dim Variable1 As String 'string variable holding the search term typed in by the user
    Variable1 = InputBox("put in what you are searching")
    'initiating a new instance of Internet Explorer and assigning it to objIE
    Set objIE = New InternetExplorer
    'make IE browser visible (False would allow IE to run in the background)
    objIE.Visible = True
    'navigate IE to the soccerstats.com home page
    objIE.navigate "http://www.soccerstats.com/"
    'wait here a few seconds while the browser is busy
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
Dim ele As Object

For Each ele In objIE.document.getElementsByTagName("input")
    If ele.Name = "searchstring" Then
        ele.Value = Variable1
    End If
Next ele

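For Each ele In objIE.document.getElementsByTagName("input")
    If ele.className = "submit" Then
        'assumed intent: the If body was empty, so click the submit button to run the search
        ele.Click
        Exit For
    End If
Next ele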

    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop

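For Each ele In objIE.document.getElementsByTagName("a")
    If ele.innerText = Variable1 Then
        'assumed intent: the If body was empty, so follow the link whose text matches the search term
        ele.Click
        Exit For
    End If
Next ele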

    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop

    'new bit: copy the text of every <td> on the page into Sheet2
    y = 2
    For Each ele In objIE.document.getElementsByTagName("td")
        'get the innerText and print it to the sheet in col A, row y
        result = ele.innerText
        Sheets("Sheet2").Range("A" & y).Value = result
        y = y + 1
    Next ele 'this closing Next was missing, so the original would not compile
End Sub


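The first column of the table has this HTML: <td height="18" align="right"> 14 Aug</td>

Can I change my code to something like For Each ele In objIE.document.getElementsByTagName("td") AND height = "18"?

And since the HTML for the next column in the table has no height attribute, could I change it to "For Each ele In objIE.document.getElementsByTagName("td") AND height = null"?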
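
For Each has no AND clause in VBA, so a filter like this would normally go in an If inside the loop. The following is only a sketch under assumptions: WriteDateCells is a hypothetical helper name, objIE is taken to be the already loaded InternetExplorer object from soccer_stats, and "Sheet2" is assumed to exist. How getAttribute reports a missing attribute (Null or an empty string) can vary with the IE document mode, so the value is normalised to a string first.

```vba
'Sketch only: WriteDateCells is a hypothetical name, not part of the original code.
'Assumes objIE has finished loading the results page and that Sheet2 exists.
Sub WriteDateCells(objIE As Object)
    Dim ele As Object
    Dim heightAttr As String
    Dim y As Long

    y = 2
    For Each ele In objIE.document.getElementsByTagName("td")
        'getAttribute may return Null when the attribute is absent;
        'appending "" turns Null (or a number) into a plain string
        heightAttr = ele.getAttribute("height") & ""
        If heightAttr = "18" Then
            'only the cells with height="18" (the date column) reach the sheet
            Sheets("Sheet2").Range("A" & y).Value = Trim(ele.innerText)
            y = y + 1
        End If
    Next ele
End Sub
```

For the columns without a height attribute, the same test could be flipped to If heightAttr = "" Then, although that would also match every other td on the page that lacks a height.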



The HTML for each column on the page is as follows. Date column:

<td height="18" align="right"> 14 Aug</td>


<td align="right"><b>Arsenal</b></td>


   <td width="45" align="center">
<a class="tooltip2" href="#">
<font color="#0000aa">
<b>3 – 4</b>


    <td align="left">
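
Since the columns differ only by attributes, it may be simpler to walk the table row by row and read the cells by position. This is a sketch under assumptions that the fragments above do not confirm: that each match sits in its own tr, and that the first td of that row is the one with height="18". WriteMatchRows is a hypothetical name, and objIE is again the loaded InternetExplorer object.

```vba
'Sketch only: WriteMatchRows is a hypothetical name, and the row layout is assumed.
Sub WriteMatchRows(objIE As Object)
    Dim tr As Object
    Dim i As Long
    Dim y As Long

    y = 2
    For Each tr In objIE.document.getElementsByTagName("tr")
        If tr.cells.Length > 0 Then
            'keep only rows whose first cell looks like the date cell
            If (tr.cells.Item(0).getAttribute("height") & "") = "18" Then
                For i = 0 To tr.cells.Length - 1
                    'innerText returns the visible text, including text nested in <a>, <font> or <b>
                    Sheets("Sheet2").Cells(y, i + 1).Value = Trim(tr.cells.Item(i).innerText)
                Next i
                y = y + 1
            End If
        End If
    Next tr
End Sub
```

Each matching row then lands on one sheet row, with the date in column A, the team in column B, the score in column C, and so on. If the page nests tables inside tables, an outer tr could match too, so the height test may need tightening.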

0 answers:
