Parsing data from a website

Time: 2017-11-10 15:16:18

Tags: excel vba excel-vba

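I am trying to create a new function for Excel where the user enters a serial value into a cell, then enters =GetPartNumber(SerialValue) in another cell, and it returns the part number associated with that serial value.

To get the part number I have to run a search on an internal intranet site. I have that part working: I can send the serial, click the submit button, and the site reloads with the correct data. It is only parsing the results that I am stuck on.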

Public Function GetPartNumber(UnitSerialNumber)

Dim ie As Object

Set ie = CreateObject("InternetExplorer.Application")

With ie
.MenuBar = 0
.Toolbar = 0
.StatusBar = 0
.Navigate " https://XXXXXXXXXXXXXXX/basicUnitData.faces"
.Visible = 0

End With

'wait until IE has finished loading
Do While ie.ReadyState = 4: DoEvents: Loop   'Do While
Do Until ie.ReadyState = 4: DoEvents: Loop   'Do Until

' This bit submits the Serial to the site and clicks the go button
With ie.document.all
    .Item("unitDataSearchForm:serialNumber").Value = UnitSerialNumber
End With
ie.document.all("unitDataSearchForm:findUnitData").Click

' Another wait as the site is reloading with the results
Do While ie.ReadyState = 4: DoEvents: Loop   'Do While
Do Until ie.ReadyState = 4: DoEvents: Loop   'Do Until

' Now here is the bit I am stuck on
GetPartNumber = "Ready to get PN!"


EndoftheSub:

Set ie = Nothing
MsgBox (GetPartNumber)

End Function

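The results page is just a simple table, and what I need to grab is the third cell in the second row (the first row being the header).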

<table id="unitDataSearchForm:outputTable" border="1" cellpadding="2" cellspacing="2" class="tablebg">
<thead>
<tr><th scope="col">Serial Number</th><th scope="col">Other Data</th><th scope="col">Part Number</th><th scope="col">BLAH</th><th scope="col">BLAHAH</th></tr></thead>
<tbody id="unitDataSearchForm:outputTable:tbody_element">
<tr class="oddRows"><td>XXX123456</td><td>YYYY</td><td>ZZZ-ZZZZZ-ZZ</td><td>SOME TEXT</td><td>BLAHAH</td>
</tr></tbody></table>

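The bit I want is just ZZZ-ZZZZZ-ZZ, so I want GetPartNumber = row 2, third TD element, but I don't know how to get that value out.

The site itself is static, so the output page won't change and the table will always be in that format. I have tried a few approaches but nothing seems to work. I only dabble in VBA occasionally, so I am far from an expert.

Any help would be great.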

1 answer:

Answer 0 (score: 0)

Try:

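Dim htmlObj As Object 'late bound, so no reference to the Microsoft HTML Object Library is needed

Set htmlObj = ie.document.getElementById("unitDataSearchForm:outputTable:tbody_element")
'the tbody holds only the data row (the header row is in thead), so the row index is 0;
'Children is zero-based, so 2 is the third cell
GetPartNumber = htmlObj.getElementsByTagName("TR")(0).Children(2).innerText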
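
If the search returns no match, the table body may be missing and the lookup above would raise an error. A more defensive variant might look like the sketch below. The helper name ExtractPartNumber is hypothetical, and the element ID and column position are taken from the HTML in the question. The row index is 0 because the tbody contains only the data row; the header row sits in thead.

```vba
' Hypothetical helper: returns the text of the third cell in the first
' body row of the results table, or "" if the table or row is missing.
Private Function ExtractPartNumber(ByVal doc As Object) As String
    Dim tbody As Object
    Dim rows As Object
    Dim firstRow As Object

    Set tbody = doc.getElementById("unitDataSearchForm:outputTable:tbody_element")
    If tbody Is Nothing Then Exit Function      ' no results table on the page

    Set rows = tbody.getElementsByTagName("tr")
    If rows.Length = 0 Then Exit Function       ' table present but no data row

    ' Index 0 is the first data row; Children is zero-based,
    ' so 2 is the "Part Number" column
    Set firstRow = rows.Item(0)
    If firstRow.Children.Length > 2 Then
        ExtractPartNumber = firstRow.Children.Item(2).innerText
    End If
End Function
```

Inside GetPartNumber this would be called as GetPartNumber = ExtractPartNumber(ie.document) once the results page has loaded.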

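By the way, the first line of this code is skipped whenever ReadyState is not already 4, as is the case right after Navigate: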

Do While ie.ReadyState = 4: DoEvents: Loop   'Do While
Do Until ie.ReadyState = 4: DoEvents: Loop   'Do Until

It also has no ie.Busy check. It should look like this:

Do Until ie.ReadyState = 4 And ie.Busy = False: DoEvents: Loop
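
Since a cell function that loops forever would hang Excel, the same wait could be wrapped in a helper with a timeout. This is only a sketch: the name WaitForPage and the 30-second default are assumptions for illustration, and 4 is the value of READYSTATE_COMPLETE.

```vba
' Hypothetical helper: waits until IE reports the page as fully loaded,
' or gives up after timeoutSeconds. Returns True if the page loaded.
Private Function WaitForPage(ByVal ie As Object, _
                             Optional ByVal timeoutSeconds As Double = 30) As Boolean
    Const READYSTATE_COMPLETE As Long = 4
    Dim startTime As Double

    startTime = Timer   ' seconds since midnight

    Do
        DoEvents        ' keep Excel responsive while waiting
        If ie.Busy = False And ie.ReadyState = READYSTATE_COMPLETE Then
            WaitForPage = True
            Exit Function
        End If
        ' Timer resets at midnight; this simple check ignores that edge case
    Loop While Timer - startTime < timeoutSeconds
End Function
```

It would be called once after Navigate and once after Click, for example If Not WaitForPage(ie) Then Exit Function. Right after Click the browser may still report the old page as complete for a moment, so checking that the results table actually exists before reading it is a sensible extra step.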