I am trying to run the following code:
public void Init(Url rootUrl)
{
    var web = new HtmlWeb();
    this.doc = web.Load(rootUrl.Value);
}
with the following parameter:
{<System.Security.Policy.Url version="1">
<Url>http://localhost:85/HCM/HCM.html</Url>
</System.Security.Policy.Url>
}
and I get the following exception:
Object reference not set to an instance of an object.
Stack trace:
at HtmlAgilityPack.HtmlDocument.ReadDocumentEncoding(HtmlNode node) in C:\Source\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlDocument.cs:line 1916
at HtmlAgilityPack.HtmlDocument.PushNodeEnd(Int32 index, Boolean close) in C:\Source\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlDocument.cs:line 1805
at HtmlAgilityPack.HtmlDocument.Parse() in C:\Source\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlDocument.cs:line 1492
at HtmlAgilityPack.HtmlDocument.Load(TextReader reader) in C:\Source\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlDocument.cs:line 769
at HtmlAgilityPack.HtmlDocument.Load(Stream stream, Boolean detectEncodingFromByteOrderMarks) in C:\Source\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlDocument.cs:line 597
at HtmlAgilityPack.HtmlWeb.Get(Uri uri, String method, String path, HtmlDocument doc, IWebProxy proxy, ICredentials creds) in C:\Source\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlWeb.cs:line 1515
at HtmlAgilityPack.HtmlWeb.LoadUrl(Uri uri, String method, WebProxy proxy, NetworkCredential creds) in C:\Source\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlWeb.cs:line 1563
at HtmlAgilityPack.HtmlWeb.Load(String url, String method) in C:\Source\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlWeb.cs:line 1152
at HtmlAgilityPack.HtmlWeb.Load(String url) in C:\Source\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlWeb.cs:line 1107
at Conduit.CPServices.Logic.HtmlContentMonitor.HtmlAgilityPackHtmlProvider.Init(Url rootUrl) in D:\Conduit\RnD\Server\Services\CP\CPServices\Logic\HtmlContentMonitor\Conduit.CPServices.Logic.HtmlContentMonitor\HtmlAgilityPackHtmlProvider.cs:line 22
at Conduit.CPServices.Logic.HtmlContentMonitor.HtmlContentManager.FetchRootAndExternlContentAsByteArray(Url rootUrl) in D:\Conduit\RnD\Server\Services\CP\CPServices\Logic\HtmlContentMonitor\Conduit.CPServices.Logic.HtmlContentMonitor\HtmlContentManager.cs:line 112
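Answer 0: (score: 1)
This is a bug in HtmlAgilityPack. It can be triggered, for example, when the document encoding declared in a <META> tag is invalid (e.g. <META http-equiv="Content-Type" content="text/html; charset=8859-9">). As Simon Mourier said, the bug was introduced in 1.4.0.0.
To avoid it, set the encoding manually, for example: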
web.Load(rootUrl.Value, Encoding.GetEncoding("iso-8859-9"));
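To check whether a charset name such as the one in the <META> tag is one the runtime recognises, a small guard like the following may help. It is only a sketch: CharsetCheck and TryGetEncoding are hypothetical names, not part of HtmlAgilityPack, and it relies only on Encoding.GetEncoding throwing ArgumentException for names it does not know. Which names resolve depends on the runtime; on .NET Core and later, code-page encodings such as iso-8859-9 are available only after registering CodePagesEncodingProvider.

```csharp
using System;
using System.Text;

static class CharsetCheck
{
    // Hypothetical helper (not part of HtmlAgilityPack): returns the Encoding
    // for a charset name, or null if this runtime does not recognise the name.
    public static Encoding TryGetEncoding(string name)
    {
        try
        {
            return Encoding.GetEncoding(name);
        }
        catch (ArgumentException)
        {
            // Encoding.GetEncoding throws ArgumentException for unknown names.
            return null;
        }
    }
}
```

If TryGetEncoding returns null for the charset a page declares, that page is a likely candidate for this exception, and a known-good encoding can be supplied explicitly instead.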
Answer 1: (score: 0)
This is probably a bug in HtmlAgilityPack, possibly triggered by the HTML the document contains.
Could you post the HTML that HtmlAgilityPack is parsing?