Question

I am using VS2010 with HtmlAgilityPack 1.4.6 (from the Net40 folder). This is my HTML:

<html>

<body>


<div id="header">

<h2 id="hd1">
    Patient Name
</h2>   
</div>
</body>


</html>

I am using the following C# code to access "hd1". Please tell me the correct way to do it.

HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
try
{
    string filePath = "E:\\file1.htm";
    htmlDoc.LoadHtml(filePath);

    if (htmlDoc.DocumentNode != null)
    { 

        HtmlNodeCollection _hdPatient = htmlDoc.DocumentNode.SelectNodes("//h2[@id=hd1]");
        // htmlDoc.DocumentNode.SelectNodes("//h2[@id='hd1']");  
        //_hdPatient.InnerHtml = "Patient SurName";
    }
}
catch (Exception ex)
{
    throw ex;
}

I have tried many permutations and combinations, and I always get null.

Please help.

Answer 1

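Your problem is the way you load data into the HtmlDocument. To load data from a file, you should use the Load(fileName) method. You are using the LoadHtml(htmlString) method instead, which treats "E:\\file1.htm" as the document content itself. When HtmlAgilityPack then looks for h2 tags in the string E:\\file1.htm, it finds nothing. This is the correct way to load an HTML file: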

string filePath = "E:\\file1.htm";
htmlDoc.Load(filePath); // use instead of LoadHtml

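@Simon Mourier also correctly pointed out that if you want a single node, you should use the SelectSingleNode method:

// Single HtmlNode (returns null when nothing matches)
var patient = htmlDoc.DocumentNode.SelectSingleNode("//h2[@id='hd1']");
patient.InnerHtml = "Patient SurName";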

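Or, if you are working with a collection of nodes, process them in a loop:

// Collection of nodes (SelectNodes returns null when nothing matches)
var patients = htmlDoc.DocumentNode.SelectNodes("//div[@class='patient']");
if (patients != null)
{
    foreach (var patient in patients)
        patient.SetAttributeValue("style", "visibility: hidden");
}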
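
To make the Load versus LoadHtml distinction concrete, here is a minimal sketch that parses markup already held in a string, which is the case LoadHtml is meant for. The class HeadingEditor and the method ReplaceInnerHtml are hypothetical names introduced only for this illustration, and the sketch assumes the id value contains no single quote, since it is pasted directly into the XPath expression.

```csharp
using System;
using HtmlAgilityPack;

public static class HeadingEditor
{
    // Hypothetical helper (not part of HtmlAgilityPack): replaces the inner
    // HTML of the first element whose id attribute equals the given value.
    public static string ReplaceInnerHtml(string html, string id, string newInnerHtml)
    {
        var doc = new HtmlDocument();

        // LoadHtml expects the markup itself. For a file on disk,
        // call doc.Load(path) instead.
        doc.LoadHtml(html);

        // Assumption: id contains no single quote, so it is safe to embed here.
        HtmlNode node = doc.DocumentNode.SelectSingleNode("//*[@id='" + id + "']");
        if (node == null)
            return html; // nothing matched; hand back the input unchanged

        node.InnerHtml = newInnerHtml;
        return doc.DocumentNode.OuterHtml;
    }
}
```

Calling HeadingEditor.ReplaceInnerHtml with the HTML from the question, the id "hd1" and the text "Patient SurName" should return the same markup with the heading text replaced. The null check matters because SelectSingleNode returns null rather than throwing when the XPath matches nothing.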

Answer 2

You were almost there:

HtmlNode _hdPatient = htmlDoc.DocumentNode.SelectSingleNode("//h2[@id='hd1']");
_hdPatient.InnerHtml = "Patient SurName";

Selecting a node with HtmlAgilityPack does not work
