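I have been trying to scrape data from the website soccerstats.com, specifically the results for the football team Arsenal (http://www.soccerstats.com/team.asp?league=england&teamid=15).
(There are several tables on the page; I am after the data in the largest one.)
My current code just dumps a jumble of text from every td tag: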
'start a new subroutine called soccer_stats
Sub soccer_stats()
'dimension (declare or set aside memory for) our variables
Dim objIE As InternetExplorer 'special object variable representing the IE browser
Dim aEle As HTMLLinkElement 'special object variable for an <a> (link) element
Dim y As Long 'row counter (Long avoids overflow past row 32,767)
Dim result As String 'string variable that will hold our result link
Dim Variable1 As String
Variable1 = InputBox("put in what you are searching")
'initiating a new instance of Internet Explorer and assigning it to objIE
Set objIE = New InternetExplorer
'make IE browser visible (False would allow IE to run in the background)
objIE.Visible = True
'navigate IE to the soccerstats home page
objIE.navigate "http://www.soccerstats.com/"
'wait here a few seconds while the browser is busy
Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
Dim ele As Object
For Each ele In objIE.document.getElementsByTagName("input")
If ele.Name = "searchstring" Then
ele.Value = Variable1
End If
Next ele
For Each ele In objIE.document.getElementsByTagName("input")
If ele.className = "submit" Then
ele.Click
End If
Next ele
Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
For Each ele In objIE.document.getElementsByTagName("a")
If ele.innerText = Variable1 Then
ele.Click
End If
Next ele
Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
'new bit
y = 2
For Each ele In objIE.document.getElementsByTagName("td")
'...get the innertext and print it to the sheet in col A, row y
result = ele.innerText
Sheets("Sheet2").Range("A" & y).Value = result
y = y + 1
Next
Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
End Sub
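Is there a way to paste the inner text into columns A, B, C, D depending on whether it meets conditions i, ii, iii, iv?
The first column of the table has the HTML:
<td height="18" align="right"> 14 Aug</td>
Could I change my code to something like For Each ele In objIE.document.getElementsByTagName("td") AND height = "18"?
And since the HTML for the next column in the table has no height, could I change it to For Each ele In objIE.document.getElementsByTagName("td") AND height = null?
Or is there a better way to scrape the whole table? Thanks for your help.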
EDIT:
The HTML for each column on the page is:
Date column:
<td height="18" align="right"> 14 Aug</td>
Home team column:
<td align="right"><b>Arsenal</b></td>
Score column:
<td width="45" align="center">
<a class="tooltip2" href="#">
<font color="#0000aa">
<b>3 – 4</b>
Away team column:
<td align="left">
Liverpool
</td>
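For reference, here is a rough sketch of the "scrape the whole table" idea, walking rows and cells instead of filtering individual td tags. It is untested against the live page and rests on assumptions: the name copy_largest_table is made up for illustration, the results table is guessed to be the table with the most rows, and objIE is expected to have finished loading the team page already (as at the end of soccer_stats above).

```vba
'Sketch only: copies the table with the most rows into Sheet2,
'one table row per sheet row and one cell per column (A, B, C, D, ...).
'Assumption: the table with the most rows is the results table.
Sub copy_largest_table(objIE As Object)
    Dim tbl As Object      'each <table> element on the page
    Dim biggest As Object  'the table with the most rows seen so far
    Dim r As Long, c As Long

    'find the table with the most rows
    For Each tbl In objIE.document.getElementsByTagName("table")
        If biggest Is Nothing Then
            Set biggest = tbl
        ElseIf tbl.Rows.Length > biggest.Rows.Length Then
            Set biggest = tbl
        End If
    Next tbl
    If biggest Is Nothing Then Exit Sub 'no tables found

    'table row r goes to sheet row r + 2; cell c goes to column c + 1
    For r = 0 To biggest.Rows.Length - 1
        For c = 0 To biggest.Rows(r).Cells.Length - 1
            Sheets("Sheet2").Cells(r + 2, c + 1).Value = _
                Trim(biggest.Rows(r).Cells(c).innerText)
        Next c
    Next r
End Sub
```

It would be called as copy_largest_table objIE after the last wait loop in soccer_stats. Because each table row maps to one sheet row, the date, home team, score and away team should land in adjacent columns without needing to test the height attribute, provided the guess about which table is largest holds.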