Question

I am trying to parse the main content (the last <table> in the DOM tree) of this site: "https://aips.um.si/PredmetiBP5/Main.asp?Mode=prg&Zavod=77&Jezik=&Nac=1&Nivo=P&Prg=1571&Let=1". I am using HtmlAgilityPack and writing the code in C#, in a WPF application in Visual Studio 17.

This is the code I am using right now:

// Force windows-1250 instead of letting HtmlWeb auto-detect the encoding.
var iso = Encoding.GetEncoding("windows-1250");
var web = new HtmlWeb
{
    AutoDetectEncoding = false,
    OverrideEncoding = iso,
};
// http = https://aips.um.si/PredmetiBP5/Main.asp?Mode=prg&Zavod=77&Jezik=&Nac=1&Nivo=P&Prg=1571&Let=1
string http = formatLetnikLink(l.Attributes["onclick"].Value).ToString();
var htmlProgDoc = web.Load(http);
// Raw text of the page, kept only for debugging.
string s = htmlProgDoc.ParsedText;

htmlProgDoc.ParsedText correctly contains all the rows that should be in the last table (I only keep this string for debugging, because the watch window is broken or something).
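To put a number on "all the rows", the opening <tr> tags in the raw text can be counted and compared with what the parser later returns. The following is only a minimal sketch: CountRowTags and the sample string are hypothetical names and data introduced for illustration, and in the application the input would be htmlProgDoc.ParsedText.

```csharp
using System;
using System.Text.RegularExpressions;

class RawRowCount
{
    // Hypothetical helper: counts opening <tr> tags in raw markup, ignoring case.
    // The [\s>] part keeps tags such as <track> from being counted.
    static int CountRowTags(string rawHtml)
    {
        return Regex.Matches(rawHtml, @"<tr[\s>]", RegexOptions.IgnoreCase).Count;
    }

    static void Main()
    {
        // Hypothetical sample markup; replace with htmlProgDoc.ParsedText in the real code.
        string s = "<table><TR><td>1</td></TR><tr class=\"x\"><td>2</td></tr></table>";

        Console.WriteLine("opening <tr> tags in raw text: " + CountRowTags(s));
    }
}
```

If this count is clearly larger than the number of row nodes found in the parsed tables, the rows are being lost during parsing and not during download.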

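I first tried to get all the tables on the page, and realized that there are 6 <table></table> tags even though you only see one table. After a few hours of debugging I realized that the main table is the last <table> in the DOM tree, and that the parser does not fully parse all the <tr> tags in that table. That is the problem: I need all the tr tags.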

var tables = htmlProgDoc.DocumentNode.SelectNodes("//table");

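There are 6 <table></table> tags, as expected, and each of them is parsed completely, including all of its rows and columns, except the last one. In the last one only the first two rows are parsed, and then the parser appears to append a </table> by itself. I also tried a direct XPath selector copied from Firefox, "/html/body/div/center[2]/font/font/font/table", instead of "//table". It found the correct table, but that table also contained only the first two rows.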

var theTableINeed = tables.Last();
//contains the correct table which I need, but with only the first two rows
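A minimal diagnostic sketch that might help narrow this down: it prints how many <tr> nodes each parsed table contains and lists whatever the parser recorded in HtmlDocument.ParseErrors. The inline markup is a hypothetical stand-in for the real page; in the application the document would come from web.Load(http). Whether ParseErrors actually explains the truncated table on this site is an assumption, not something confirmed here.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

class TableRowDiagnostics
{
    static void Main()
    {
        // Hypothetical stand-in markup; the real document comes from web.Load(http).
        const string html =
            "<html><body>" +
            "<table><tr><td>a</td></tr></table>" +
            "<table><tr><td>b</td></tr><tr><td>c</td></tr></table>" +
            "</body></html>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var tables = doc.DocumentNode.SelectNodes("//table");
        if (tables == null)
        {
            Console.WriteLine("no <table> nodes found");
            return;
        }

        // Report how many row nodes ended up under each parsed table.
        for (int i = 0; i < tables.Count; i++)
        {
            int rows = tables[i].Descendants("tr").Count();
            Console.WriteLine("table " + i + ": " + rows + " <tr> node(s)");
        }

        // List the problems the parser noticed in the markup, if any.
        foreach (var error in doc.ParseErrors)
        {
            Console.WriteLine("line " + error.Line + ", col " + error.LinePosition +
                              ": " + error.Code + " - " + error.Reason);
        }
    }
}
```

A reported error located near the third row of the last table would suggest that malformed markup at that position is what makes the parser close the table early.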

HtmlAgilityPack only partially parses table rows

0 answers: