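I am using the SHDocVw.InternetExplorer API in a VB.NET WinForms application to fetch elements from Internet Explorer. I can easily access elements in the parent document and inside frame elements, but I cannot access elements inside an 'embed' container. Here is the sample code: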
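'Create the IE automation object (the original declaration never instantiated it)
Dim ie As New SHDocVw.InternetExplorer()
ie.Navigate("Some URL")
ie.Visible = True
'Wait for the page to finish loading before touching ie.Document
While ie.Busy OrElse ie.ReadyState <> SHDocVw.tagREADYSTATE.READYSTATE_COMPLETE
    System.Threading.Thread.Sleep(100)
End While
Dim ieDoc As mshtml.IHTMLDocument2 = CType(ie.Document, mshtml.IHTMLDocument2)
'All Elements
Dim allElements = ieDoc.all
'Frames
Dim allFrames = ieDoc.frames
'Fetch each frame and use its document to get all elements
'Embeds
Dim allEmbed = ieDoc.embeds
'How to fetch document inside embed to access its elements?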
Here is a sample HTML:
Sample.html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Sample</title>
</head>
<body>
<embed src="test.html" name="test1"/>
</body>
</html>
test.html
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Sample</title>
</head>
<body bgcolor="#FFFFFF">
<button>Button1</button>
<label>Test 1</label>
</body>
</html>
How can I access the button and label inside test.html, which is loaded into Sample.html through the 'embed' tag?
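Edit 1:
According to my research, I can access the document inside an 'object' container through the .contentDocument property of the 'object' element, but the same does not work for an 'embed' container.
Calling getSVGDocument() on the 'embed' container gives me some COM object, but I cannot cast it to mshtml.IHTMLDocument2.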
Answer 0 (score: 0)
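Well, I have been using "Html Agility Pack" to parse HTML here, and it works very well. You can get all the embed elements in the page, then download and read/parse the content each one points to. http://html-agility-pack.net/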
My sample:
Imports System.Net
Imports HtmlAgilityPack

'<html xmlns='http://www.w3.org/1999/xhtml'>
'<head>
' <title>Sample</title>
'</head>
'<body>
' <embed src='http://stackoverflow.com/questions/41806246/access-elements-inside-html-embed-tag-source-html-using-vb-net' name='test1'/>
'</body>
'</html>
'The htmlCode string:
Dim htmlCode As String = "<html xmlns='http://www.w3.org/1999/xhtml'><head><title>Sample</title></head><body><embed src='http://stackoverflow.com/questions/41806246/access-elements-inside-html-embed-tag-source-html-using-vb-net' name='test1'/></body></html>"
'Fully qualified to avoid a clash with System.Windows.Forms.HtmlDocument
Dim doc As New HtmlAgilityPack.HtmlDocument()
doc.LoadHtml(htmlCode)
'Collect every <embed> element in the parsed page
Dim nodes = doc.DocumentNode.Descendants("embed")
Using client As New WebClient()
    For Each item As HtmlNode In nodes
        'Read the src attribute; "" is returned when it is missing
        Dim srcEmbed As String = item.GetAttributeValue("src", "")
        If Not String.IsNullOrWhiteSpace(srcEmbed) Then
            'Download the HTML that the embed points to
            Dim yourEmbedHtml As String = client.DownloadString(srcEmbed)
            'Do what you want with yourEmbedHtml
        End If
    Next
End Using
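For the question's own files, the embed src is relative ("test.html"), so it has to be resolved against the page URL before it can be downloaded. The following is a minimal sketch of that step under stated assumptions: `ReadEmbeddedControls` is a hypothetical helper name, and "http://localhost/Sample.html" is a placeholder URL, neither of which comes from the original post. Note that this approach reads the HTML as served, not the live DOM inside Internet Explorer, so changes made by scripts after loading would not be visible.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Net
Imports HtmlAgilityPack

Module EmbedReader
    'Hypothetical helper: resolves an embed's src against the page URL,
    'downloads the target, and returns the text of its <button> and <label> elements.
    Function ReadEmbeddedControls(pageUrl As String, embedSrc As String) As List(Of String)
        'Resolve a relative src such as "test.html" against the hosting page
        Dim target As New Uri(New Uri(pageUrl), embedSrc)

        Dim html As String
        Using client As New WebClient()
            html = client.DownloadString(target)
        End Using

        'Fully qualified to avoid a clash with System.Windows.Forms.HtmlDocument
        Dim doc As New HtmlAgilityPack.HtmlDocument()
        doc.LoadHtml(html)

        Dim results As New List(Of String)()
        For Each node As HtmlNode In doc.DocumentNode.Descendants()
            If node.Name = "button" OrElse node.Name = "label" Then
                results.Add(node.Name & ": " & node.InnerText.Trim())
            End If
        Next
        Return results
    End Function

    Sub Main()
        'Placeholder URL, assumed for illustration only
        For Each line As String In ReadEmbeddedControls("http://localhost/Sample.html", "test.html")
            Console.WriteLine(line)
        Next
    End Sub
End Module
```

In the WinForms application, the page URL could presumably be taken from the running browser instance and the src from each embed element instead of the hard-coded strings used here.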