How do I get the id of HTML elements in an HTML file in C# using Html Agility Pack?

Time: 2014-04-03 05:41:49

Tags: c# html-agility-pack

I have something like this:

<div class='mainclass subclass1' quest-id='123'> </div>
<div class='mainclass subclass2' quest-id='234'> </div>
<input quest-id='3236'> </input>
<textarea quest-id='256'> </textarea>

I want the quest-id of every div whose class includes subclass1 or subclass2, plus the quest-id of every input and textarea. How can I do this in C# using Html Agility Pack?

I have C# code like this:

HtmlDocument document = new HtmlDocument();
document.LoadHtml(obj.NewPage.Content);

HtmlNode htmlRootElement = document.DocumentNode.SelectSingleNode("/html");
HtmlNode bodyElement = htmlRootElement.SelectSingleNode("body");

I don't know how to continue.

3 answers:

Answer 0: (score: 0)

Here is a snippet I wrote and tested.

 const string sampleHTML = @"<div class='mainclass subclass1' quest-id='123'></div>
    <div class='mainclass subclass2' quest-id='234'></div>
    <input quest-id='3236'> </input>
    <textarea quest-id='256'> </textarea>";


 HtmlAgilityPack.HtmlDocument myDoc = new HtmlAgilityPack.HtmlDocument();
 myDoc.LoadHtml(sampleHTML);
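 // "//div" searches the whole document, so the query also works when the divs are nested inside <html>/<body>.
 HtmlNodeCollection foundNodes = myDoc.DocumentNode.SelectNodes("//div[contains(@class, 'subclass2')]");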
 MessageBox.Show(foundNodes[0].Attributes["quest-id"].Value);

When I run the snippet, I see the value '234' in the message box.

Answer 1: (score: 0)

The following XPath gets all the quest-id values from the sample HTML posted in this question:

//div[contains(@class, 'subclass1') or contains(@class, 'subclass2')]
| //input[@quest-id] 
| //textarea[@quest-id]

Working example:

var html = @"<root><div class='mainclass subclass1' quest-id='123'> </div>
<div class='mainclass subclass2' quest-id='234'> </div>
<input quest-id='3236'> </input>
<textarea quest-id='256'> </textarea></root>";
var doc = new HtmlDocument();
doc.LoadHtml(html);

var nodes = 
    doc.DocumentNode
       .SelectNodes("//div[contains(@class, 'subclass1') or contains(@class, 'subclass2')]"
                        + " | //input[@quest-id] "
                        + " | //textarea[@quest-id]");
foreach (var node in nodes)
{
    Console.WriteLine(node.GetAttributeValue("quest-id", ""));
}
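
One caveat with contains(@class, 'subclass1') is that it is a substring test, so it would also match a class such as subclass10. A minimal sketch of a stricter, whole-token match follows; the HasClass helper and the extra subclass10 div are hypothetical and only there for illustration, and the sketch assumes the standard XPath 1.0 functions concat and normalize-space are available to SelectNodes. It also guards against SelectNodes returning null, which is what it does by default when nothing matches.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    // Hypothetical helper: builds an XPath predicate that matches a whole
    // class token instead of a substring of the class attribute.
    static string HasClass(string name) =>
        "contains(concat(' ', normalize-space(@class), ' '), ' " + name + " ')";

    static void Main()
    {
        // Sample markup; the subclass10 div is an invented counter-example.
        const string html = @"<root><div class='mainclass subclass1' quest-id='123'> </div>
<div class='mainclass subclass10' quest-id='999'> </div></root>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var nodes = doc.DocumentNode.SelectNodes("//div[" + HasClass("subclass1") + "]");

        // By default SelectNodes returns null, not an empty collection, when nothing matches.
        if (nodes == null)
        {
            return;
        }

        foreach (var node in nodes)
        {
            Console.WriteLine(node.GetAttributeValue("quest-id", ""));
        }
    }
}
```

Whether the stricter test is worth the longer expression depends on how the class names in the real page are chosen.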

Answer 2: (score: 0)

 // div is an HtmlNode you have already selected; the second argument is returned when the attribute is missing.
 string id = div.GetAttributeValue("id", "");
 string name = div.GetAttributeValue("name", "");
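
As a rough sketch of how the same call could be applied to the quest-id attribute from the question: the XPath below, which selects every element carrying a quest-id attribute, and the shortened sample markup are assumptions made for illustration, not part of this answer. Note that this XPath is broader than what the question asks for, since it does not filter divs by class.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Shortened sample markup based on the question.
        const string html = @"<div class='mainclass subclass1' quest-id='123'> </div>
<textarea quest-id='256'> </textarea>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Illustrative XPath: any element that has a quest-id attribute.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//*[@quest-id]");

        // By default SelectNodes returns null when nothing matches, so guard before looping.
        if (nodes == null)
        {
            return;
        }

        foreach (HtmlNode node in nodes)
        {
            // GetAttributeValue(name, default) already returns a string; no ToString() is needed.
            string questId = node.GetAttributeValue("quest-id", "");
            Console.WriteLine(questId);
        }
    }
}
```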