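I've written a very small website bot in C# using the default WebBrowser control. Almost everything works as expected, but the last step of my automation is failing.
The site is built from multiple iframes. That isn't a big problem in itself, because I simply access the frames and their elements with
webBrowser1.Document.Window.Frames[0].Document.GetElementById("element").InvokeMember("click");
However, this doesn't work when the IFRAME's source is hosted on a different domain than the actual site. While searching the internet for an answer, I came across an MSDN article that mentions this specific problem. It refers to security measures against cross-site scripting, which may be what causes the error.
I couldn't find a way to disable this feature, so I went ahead and recoded everything to use geckofx-12 instead of the default (IE-based) WebBrowser control, but I ran into a similar problem.
My question is: is there any way to get around this annoying behavior? I don't really care about the security implications, nor whether geckofx or the default WebBrowser control is used. I just want to programmatically access elements of a page hosted on a different domain without running into an UnauthorizedAccessException.
I'd be glad to get advice from the gurus out there.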
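For readers who want to see the failure in isolation, here is a minimal sketch of the call from the question wrapped so that the cross-domain case is reported instead of crashing the bot. The class and method names (FrameClicker, TryClick) are hypothetical and not part of the original post; the sketch only contains the problem, it does not work around it.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical helper, for illustration only.
static class FrameClicker
{
    // Returns true if the click was dispatched, false if the frame or the
    // element could not be reached (including the cross-domain case).
    public static bool TryClick(WebBrowser browser, int frameIndex, string elementId)
    {
        try
        {
            HtmlDocument doc = browser.Document;
            if (doc == null || doc.Window == null)
            {
                return false;
            }

            HtmlWindowCollection frames = doc.Window.Frames;
            if (frames == null || frameIndex < 0 || frameIndex >= frames.Count)
            {
                return false;
            }

            // For a frame served from another domain, reading Document is
            // where the question reports the UnauthorizedAccessException.
            HtmlElement element = frames[frameIndex].Document.GetElementById(elementId);
            if (element == null)
            {
                return false;
            }

            element.InvokeMember("click");
            return true;
        }
        catch (UnauthorizedAccessException)
        {
            // The frame's document belongs to a different domain.
            return false;
        }
    }
}
```

A call such as FrameClicker.TryClick(webBrowser1, 0, "element") would typically be made from the DocumentCompleted handler, once the frames have finished loading.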
Answer 0 (score: 7)
You can't access frames from a different domain. That is a security feature. There is a small hack, though:
using System;
using System.Runtime.InteropServices;
using mshtml;   // needs a project reference to Microsoft.mshtml
using SHDocVw;  // needs a project reference to SHDocVw

public class CrossFrameIE
{
    // Returns null in case of failure.
    public static IHTMLDocument2 GetDocumentFromWindow(IHTMLWindow2 htmlWindow)
    {
        if (htmlWindow == null)
        {
            return null;
        }

        // First try the usual way to get the document.
        try
        {
            IHTMLDocument2 doc = htmlWindow.document;
            return doc;
        }
        catch (COMException comEx)
        {
            // A COMException is probably never thrown here, but check just to be sure.
            if (comEx.ErrorCode != E_ACCESSDENIED)
            {
                return null;
            }
        }
        catch (System.UnauthorizedAccessException)
        {
            // Expected for a cross-domain frame; fall through to the workaround below.
        }
        catch
        {
            // Any other error.
            return null;
        }

        // At this point the error was E_ACCESSDENIED because the frame contains a document from another domain.
        // IE tries to prevent a cross-frame scripting security issue.
        try
        {
            // Convert IHTMLWindow2 to IWebBrowser2 using IServiceProvider.
            IServiceProvider sp = (IServiceProvider)htmlWindow;

            // Use IServiceProvider.QueryService to get the IWebBrowser2 object.
            Object brws = null;
            sp.QueryService(ref IID_IWebBrowserApp, ref IID_IWebBrowser2, out brws);

            // Get the document from IWebBrowser2.
            IWebBrowser2 browser = (IWebBrowser2)(brws);
            return (IHTMLDocument2)browser.Document;
        }
        catch
        {
            // The workaround failed as well; fall through and return null.
        }

        return null;
    }

    private const int E_ACCESSDENIED = unchecked((int)0x80070005L);
    private static Guid IID_IWebBrowserApp = new Guid("0002DF05-0000-0000-C000-000000000046");
    private static Guid IID_IWebBrowser2 = new Guid("D30C1661-CDAF-11D0-8A3E-00C04FC9E26E");
}

// This is the COM IServiceProvider interface, not the System.IServiceProvider .NET interface!
[ComImport(), ComVisible(true), Guid("6D5140C1-7436-11CE-8034-00AA006009FA"),
InterfaceTypeAttribute(ComInterfaceType.InterfaceIsIUnknown)]
public interface IServiceProvider
{
    [return: MarshalAs(UnmanagedType.I4)]
    [PreserveSig]
    int QueryService(ref Guid guidService, ref Guid riid, [MarshalAs(UnmanagedType.Interface)] out object ppvObject);
}
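Answer 1 (score: 3)
I slightly updated the hack that Daniel Bogdan posted so that it uses extension methods and gives you a way to call it without having to go into the mshtml namespace yourself: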
using mshtml;
using SHDocVw;
using System;
using System.Reflection;
using System.Runtime.InteropServices;
using System.Windows.Forms;

namespace TradeAutomation
{
    public static class CrossFrameIE
    {
        // Non-public WinForms members, looked up once via reflection.
        private static FieldInfo ShimManager = typeof(HtmlWindow).GetField("shimManager", BindingFlags.NonPublic | BindingFlags.Instance);
        private static ConstructorInfo HtmlDocumentCtor = typeof(HtmlDocument).GetConstructors(BindingFlags.NonPublic | BindingFlags.Instance)[0];

        // Returns null in case of failure.
        public static HtmlDocument GetDocument(this HtmlWindow window)
        {
            var rawDocument = (window.DomWindow as IHTMLWindow2).GetDocumentFromWindow();
            if (rawDocument == null)
            {
                // Do not wrap a missing document in an HtmlDocument.
                return null;
            }

            var shimManager = ShimManager.GetValue(window);
            var htmlDocument = HtmlDocumentCtor
                .Invoke(new[] { shimManager, rawDocument }) as HtmlDocument;
            return htmlDocument;
        }

        // Returns null in case of failure.
        public static IHTMLDocument2 GetDocumentFromWindow(this IHTMLWindow2 htmlWindow)
        {
            if (htmlWindow == null)
            {
                return null;
            }

            // First try the usual way to get the document.
            try
            {
                IHTMLDocument2 doc = htmlWindow.document;
                return doc;
            }
            catch (COMException comEx)
            {
                // A COMException is probably never thrown here, but check just to be sure.
                if (comEx.ErrorCode != E_ACCESSDENIED)
                {
                    return null;
                }
            }
            catch (System.UnauthorizedAccessException)
            {
                // Expected for a cross-domain frame; fall through to the workaround below.
            }
            catch
            {
                // Any other error.
                return null;
            }

            // At this point the error was E_ACCESSDENIED because the frame contains a document from another domain.
            // IE tries to prevent a cross-frame scripting security issue.
            try
            {
                // Convert IHTMLWindow2 to IWebBrowser2 using IServiceProvider.
                IServiceProvider sp = (IServiceProvider)htmlWindow;

                // Use IServiceProvider.QueryService to get the IWebBrowser2 object.
                Object brws = null;
                sp.QueryService(ref IID_IWebBrowserApp, ref IID_IWebBrowser2, out brws);

                // Get the document from IWebBrowser2.
                IWebBrowser2 browser = (IWebBrowser2)(brws);
                return (IHTMLDocument2)browser.Document;
            }
            catch
            {
                // The workaround failed as well; fall through and return null.
            }

            return null;
        }

        private const int E_ACCESSDENIED = unchecked((int)0x80070005L);
        private static Guid IID_IWebBrowserApp = new Guid("0002DF05-0000-0000-C000-000000000046");
        private static Guid IID_IWebBrowser2 = new Guid("D30C1661-CDAF-11D0-8A3E-00C04FC9E26E");
    }

    // This is the COM IServiceProvider interface, not the System.IServiceProvider .NET interface!
    [ComImport(), ComVisible(true), Guid("6D5140C1-7436-11CE-8034-00AA006009FA"),
    InterfaceTypeAttribute(ComInterfaceType.InterfaceIsIUnknown)]
    public interface IServiceProvider
    {
        [return: MarshalAs(UnmanagedType.I4)]
        [PreserveSig]
        int QueryService(ref Guid guidService, ref Guid riid, [MarshalAs(UnmanagedType.Interface)] out object ppvObject);
    }
}
Usage:
webBrowser1.Document.Window.Frames["main"].GetDocument();
As I mentioned in the comments above, you also need to add a reference to SHDocVw. You can find instructions here: Add reference 'SHDocVw' in C# project using Visual C# 2010 Express
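To connect this back to the original question, the usage line can be expanded into a small helper that clicks an element inside the cross-domain frame. This is only a sketch: CrossFrameUsage and ClickInFrame are hypothetical names, the code relies on the GetDocument extension method from the TradeAutomation namespace shown in this answer, and it assumes GetDocument returns null when the frame's document cannot be obtained.

```csharp
using System;
using System.Windows.Forms;
using TradeAutomation; // brings the GetDocument extension method into scope

// Hypothetical helper, for illustration only.
static class CrossFrameUsage
{
    // Returns true if the click was dispatched, false otherwise.
    public static bool ClickInFrame(WebBrowser browser, string frameName, string elementId)
    {
        if (browser.Document == null || browser.Document.Window == null)
        {
            return false;
        }

        HtmlWindow frame;
        try
        {
            frame = browser.Document.Window.Frames[frameName];
        }
        catch (ArgumentException)
        {
            // No frame with that name could be looked up.
            return false;
        }

        if (frame == null)
        {
            return false;
        }

        // Use the hack instead of frame.Document, which fails for a
        // frame served from another domain.
        HtmlDocument frameDocument = frame.GetDocument();
        if (frameDocument == null)
        {
            return false; // assumption: null signals that the hack failed
        }

        HtmlElement element = frameDocument.GetElementById(elementId);
        if (element == null)
        {
            return false;
        }

        element.InvokeMember("click");
        return true;
    }
}
```

Calling CrossFrameUsage.ClickInFrame(webBrowser1, "main", "element") from the DocumentCompleted handler is a reasonable place to start, since the frame content has to be loaded before its elements exist.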
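Answer 2 (score: 1)
I haven't tried this, but changing the document domain apparently works.
With geckofx 12 it looks like this might be done through nsIDOMHTMLDocument.SetDomainAttribute (GeckoDocument.Domain has no setter, but you could easily add one).
That is, if you change the document's domain to match the child frame, you can access it.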
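For the default WebBrowser control, the same idea might look like the sketch below, which uses the settable HtmlDocument.Domain property. DomainRelaxer and TryRelaxDomain are hypothetical names, and "example.com" in the usage note is a placeholder. Keep in mind that browsers only allow document.domain to be set to the current host or one of its parent domains, and the framed page has to set the same value itself, so this is unlikely to help when the two sites share no common parent domain.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical helper, for illustration only.
static class DomainRelaxer
{
    // Tries to relax the top-level document's domain to parentDomain.
    // Returns true if the assignment was accepted.
    public static bool TryRelaxDomain(WebBrowser browser, string parentDomain)
    {
        HtmlDocument document = browser.Document;
        if (document == null)
        {
            return false;
        }

        try
        {
            // Corresponds to setting document.domain from script.
            document.Domain = parentDomain;
            return true;
        }
        catch (ArgumentException)
        {
            // Expected when the value is not the current host or a parent
            // domain of it.
            return false;
        }
    }
}
```

For example, if the page is served from www.example.com and the frame from frames.example.com, DomainRelaxer.TryRelaxDomain(webBrowser1, "example.com") could be tried after the page has loaded, provided the frame's own script also sets document.domain to "example.com".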