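I have a question about HTML parsing. I have a website with some products, and I would like to capture the image URLs into my current spreadsheet. The spreadsheet is quite large, but it contains the ItemNbr in column 3; I want the URL in column 27, with one row per product (item).
My idea is to grab the 'regular', 'large', or 'verylarge' image (it doesn't matter which). This is the structure of the site (nested inside various other divs):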
<div id="MainDisplay" class="miMaindisplay">
<a href="http://www.example.com/verylarge/12425/nl" id="ctl00_PageContent_MultiImage_jqzoom" class="loupe">
<div class="zoomPad">
<img src="http://www.example.com/regular/12425/nl" id="ctl00_PageContent_MultiImage_PreviewImage" class="miPreviewImage">
<div class="zoomPup"></div>
<div class="zoomWindow">
<div class="zoomWrapper">
<div class="zoomWrapperTitle"></div>
<div class="zoomWrapperImage">
<img src="http://www.example.com/large/12425/nl">
</div>
</div>
</div>
<div class="zoomPreload">Loading zoom</div>
</div>
</a>
</div>
I can get the URL in the JS console with the following line:
document.getElementById('ctl00_PageContent_MultiImage_PreviewImage').src;
The result is:
http://www.example.com/regular/12425/nl
But I have had no success in VBA. Here is my code snippet:
Sub ParseImage()
    Dim Cell As Integer
    Dim ItemNbr As String
    Dim AElement As Object
    Dim AElements As IHTMLElementCollection
    Dim IE As MSXML2.XMLHTTP60
    Set IE = New MSXML2.XMLHTTP60
    Dim HTMLDoc As MSHTML.HTMLDocument
    Dim HTMLBody As MSHTML.HTMLBody
    Set HTMLDoc = New MSHTML.HTMLDocument
    Set HTMLBody = HTMLDoc.body
    For Cell = 1 To 5 'I iterate through the file row by row
        ItemNbr = Cells(Cell, 3).Value 'ItemNbr are in the 3rd Column of my spreadsheet
        IE.Open "GET", "http://www.example.com/?item=" & ItemNbr, False
        IE.send
        While IE.ReadyState <> 4
            DoEvents
        Wend
        HTMLBody.innerHTML = IE.responseText
        Set AElements = HTMLDoc.getElementsByTagName("a")
        For Each AElement In AElements
            If AElement.id = "ctl00_PageContent_MultiImage_PreviewImage" Then
                Cells(Cell, 27) = AElement.src 'I write URL in the 27th column
            End If
        Next AElement
        Application.Wait (Now + TimeValue("0:00:2"))
    Next Cell
End Sub
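I have of course added the references the code needs to the VBA project (it uses the early-bound MSXML2 and MSHTML types).
Thanks for your help!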
Answer 0 (score: 1)
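If the element you are targeting is identified by an id in the HTML page, a more direct approach is to use the getElementById method of the HTML document object. Note also that your loop only walks the <a> elements, while that id belongs to the <img> element, so the comparison never matches.
Try replacing this part: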
Set AElements = HTMLDoc.getElementsByTagName("a")
For Each AElement In AElements
    If AElement.id = "ctl00_PageContent_MultiImage_PreviewImage" Then
        Cells(Cell, 27) = AElement.src 'I write URL in the 27th column
    End If
Next AElement
with something like:
Dim previewImg As Object 'declare with the other variables; needed under Option Explicit
Set previewImg = HTMLDoc.getElementById("ctl00_PageContent_MultiImage_PreviewImage")
If Not previewImg Is Nothing Then Cells(Cell, 27) = previewImg.getAttribute("src")
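For reference, here is how the whole routine might look with that change folded in. This is only a sketch under a few assumptions: the procedure name ParseImageById and the variable names are made up for illustration, the project references Microsoft HTML Object Library and Microsoft XML, v6.0, and the <img> with that id is present in the raw HTML returned by the server (if the page builds it with JavaScript, a plain HTTP request will not see it). The status check and the fresh document per row are additions that are not in the original code.

```vba
Sub ParseImageById()
    ' Hypothetical name; assumes references to
    ' "Microsoft HTML Object Library" and "Microsoft XML, v6.0".
    Dim rowIdx As Long
    Dim itemNbr As String
    Dim http As MSXML2.XMLHTTP60
    Dim htmlDoc As MSHTML.HTMLDocument
    Dim previewImg As Object

    Set http = New MSXML2.XMLHTTP60

    For rowIdx = 1 To 5 'rows to process, as in the question
        itemNbr = Cells(rowIdx, 3).Value 'ItemNbr is in column 3

        'Synchronous request (third argument False), so send returns
        'only once the response is complete; no ReadyState loop needed.
        http.Open "GET", "http://www.example.com/?item=" & itemNbr, False
        http.send

        If http.Status = 200 Then
            'Fresh document per row so an earlier page cannot leak through.
            Set htmlDoc = New MSHTML.HTMLDocument
            htmlDoc.body.innerHTML = http.responseText

            'Look the <img> up directly by its id.
            Set previewImg = htmlDoc.getElementById("ctl00_PageContent_MultiImage_PreviewImage")
            If Not previewImg Is Nothing Then
                Cells(rowIdx, 27).Value = previewImg.getAttribute("src") 'URL goes in column 27
            End If
        End If

        Application.Wait Now + TimeValue("0:00:02") 'pause between requests
    Next rowIdx
End Sub
```

If a row stays empty after running it, that most likely means either the request did not return status 200 or the element was not in the returned markup, which is worth checking by dumping http.responseText for one item.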