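I want to scrape information from a table inside a div element.
When the id I targeted did not belong to a div element, I was able to extract the information. When I try to get the div element by its ID, I get:
Error 13: Type mismatch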
Sub Test1()
    Dim IE As Object
    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = True
    IE.navigate "http://www.concorindia.com/containerquery.aspx"
    Do While IE.Busy
        Application.Wait DateAdd("s", 1, Now)
    Loop
    Set Doc = IE.document
    IE.document.getElementById("contno").Value = ThisWorkbook.Sheets("Status").Range("B3").Value
    Doc.getElementById("CONTButton1").Click
    Set Data = Doc.getElementById("PPosition")
End Sub
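My plan was to first get everything inside the div with id "PPosition" and then extract the information from it, but a message box shows Error 13: Type mismatch.
Can someone help me get the information from the table for the code above, such as the train number, departure status, etc.?
Sample container number: TCNU4171692
The website I intend to scrape the data from is also mentioned in the code (http://www.concorindia.com/containerquery.aspx).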
Answer 0 (score: 1)
This is a general approach for printing an entire HTML table to a worksheet:
Sub ScrapeContainerInfo()
    Dim req As New WinHttpRequest
    Dim doc As New HTMLDocument
    Dim div As HTMLDivElement
    Dim table As HTMLTable
    Dim tableRow As HTMLTableRow
    Dim tableCell As HTMLTableCell
    Dim sht As Worksheet
    Dim i As Long, j As Long
    Dim url As String, containerNumber As String, reqBody As String

    Set sht = ThisWorkbook.Worksheets("Sheet2")
    containerNumber = "TCNU4171692"
    url = "http://www.concorindia.com/containerquery.aspx"
    ' Form fields the page posts when the query button is clicked
    reqBody = "__VIEWSTATE=%2FwEPDwULLTE1Njk0Mzk4MzkPZBYCAgoPZBYEAgEPDxYCHgdWaXNpYmxlaGRkAgMPZBYEAgMPEGRkFgFmZAIFDw9kFgIeB29uY2xpY2sFIWphdmFzY3JpcHQ6ZXJyPXRlc3QoKTtyZXR1cm4gZXJyO2RkS1KgJsS2Kb22YOy%2FEN0XTBRc8lY%3D&__EVENTVALIDATION=%2FwEWBgKk%2BrO6AwKhk42ICgKmqIGHDAKbyfWzBQLvyamyBQKxlra5AfFIxQQ%2BvdUNsDciaOk4g0LyycSG&contno=" & containerNumber & "&drpimpexp=Any&CONTButton1=Submit+Query"

    ' Send the form as a synchronous POST and load the response into an HTML document
    With req
        .Open "POST", url, False
        .setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
        .send reqBody
        doc.body.innerHTML = .responseText
    End With

    ' The results table is the first table inside the "PPosition" div
    Set div = doc.getElementById("PPosition")
    Set table = div.getElementsByTagName("table")(0)

    ' Copy every cell to the worksheet; output starts at row 2, column 2
    i = 1
    For Each tableRow In table.Rows
        i = i + 1
        j = 1
        For Each tableCell In tableRow.Cells
            j = j + 1
            sht.Cells(i, j) = tableCell.innerText
        Next tableCell
    Next tableRow
End Sub
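References used: Microsoft WinHTTP Services, version 5.1 and Microsoft HTML Object Library.
The output is as follows:
Now, if you want to access the table's information in a more targeted way, you can do this:
Debug.Print table.Rows(1).Cells(0).innerText
The code above prints the first cell of the table's second row to the Immediate window. You can modify it accordingly to access any cell, keeping in mind that indices start at 0.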
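EDIT
I wrongly assumed that getting the actual HTML response was not the problem, but since it apparently is, I have updated the code to include the HTTP request that needs to be sent. I avoid using IE whenever I can.
I have hardcoded one specific container number. The code can easily be modified to loop through multiple container numbers.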
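As a rough sketch of what such a loop might look like: the code below assumes the container numbers are listed in column A of a worksheet named "Containers", and it moves the request logic into a helper called WriteContainerTable. Both of those names are hypothetical and not part of the answer above. It also assumes the same two references are set and that the server keeps accepting the hardcoded form state; if it does not, those values would need to be refreshed from the live page.

```vba
Option Explicit

' Form state copied from the answer's reqBody (everything up to the container number).
Private Const FORM_STATE As String = "__VIEWSTATE=%2FwEPDwULLTE1Njk0Mzk4MzkPZBYCAgoPZBYEAgEPDxYCHgdWaXNpYmxlaGRkAgMPZBYEAgMPEGRkFgFmZAIFDw9kFgIeB29uY2xpY2sFIWphdmFzY3JpcHQ6ZXJyPXRlc3QoKTtyZXR1cm4gZXJyO2RkS1KgJsS2Kb22YOy%2FEN0XTBRc8lY%3D&__EVENTVALIDATION=%2FwEWBgKk%2BrO6AwKhk42ICgKmqIGHDAKbyfWzBQLvyamyBQKxlra5AfFIxQQ%2BvdUNsDciaOk4g0LyycSG&contno="
Private Const FORM_TAIL As String = "&drpimpexp=Any&CONTButton1=Submit+Query"
Private Const QUERY_URL As String = "http://www.concorindia.com/containerquery.aspx"

' Loops over the container numbers in column A of the (hypothetical) "Containers" sheet.
Sub ScrapeAllContainers()
    Dim src As Worksheet, dst As Worksheet
    Dim lastRow As Long, r As Long, nextRow As Long

    Set src = ThisWorkbook.Worksheets("Containers") ' assumed sheet name
    Set dst = ThisWorkbook.Worksheets("Sheet2")
    lastRow = src.Cells(src.Rows.Count, 1).End(xlUp).Row

    nextRow = 1
    For r = 1 To lastRow
        nextRow = WriteContainerTable(CStr(src.Cells(r, 1).Value), dst, nextRow)
    Next r
End Sub

' Hypothetical helper: posts one container number, writes the result table to sht
' starting at startRow, and returns the next free row.
Private Function WriteContainerTable(ByVal containerNumber As String, _
                                     ByVal sht As Worksheet, _
                                     ByVal startRow As Long) As Long
    Dim req As New WinHttpRequest
    Dim doc As New HTMLDocument
    Dim div As HTMLDivElement
    Dim table As HTMLTable
    Dim tableRow As HTMLTableRow
    Dim tableCell As HTMLTableCell
    Dim i As Long, j As Long

    WriteContainerTable = startRow ' default when nothing is written

    With req
        .Open "POST", QUERY_URL, False
        .setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
        .send FORM_STATE & containerNumber & FORM_TAIL
        If .Status <> 200 Then Exit Function
        doc.body.innerHTML = .responseText
    End With

    Set div = doc.getElementById("PPosition")
    If div Is Nothing Then Exit Function
    If div.getElementsByTagName("table").Length = 0 Then Exit Function
    Set table = div.getElementsByTagName("table")(0)

    i = startRow
    For Each tableRow In table.Rows
        j = 1
        sht.Cells(i, j).Value = containerNumber ' column A identifies the container
        For Each tableCell In tableRow.Cells
            j = j + 1
            sht.Cells(i, j).Value = tableCell.innerText
        Next tableCell
        i = i + 1
    Next tableRow

    WriteContainerTable = i
End Function
```

Running ScrapeAllContainers stacks each container's table below the previous one in Sheet2. The container number is written to the first column so the rows can be told apart. Containers that return no table are skipped.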