How do I extract a span value from HTML using XPath in C#?

Asked: 2017-10-25 12:25:24

Tags: c# html xpath

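  1. I have a problem: I just want to extract data using an XPath selector. The inner text of the span inside the div below is what I want to display. Please help me out.
  2. I also run this selection 16 times on each pass through a loop.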

    <div class="l">
    <span id="ls_title_7596012" class="ls_h_desc" title="Required 10 marla 
    old house in any block of bahria town">Required 10 marla old house in 
    any block of bahria town</span>
    </div>
    
  3. I tried the following, but it did not work.

     var name=htmlDocument?.DocumentNode?.SelectNodes("//div[@class=\"1\"]//span[@class=\"ls_h_desc\"]//title")[0].InnerText;
    

2 answers:

Answer 0 (score: 0)

I prefer using HtmlAgilityPack with CSS selectors. There is a package that does this: https://github.com/hcesar/HtmlAgilityPack.CssSelector

using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // QuerySelectorAll is provided by the HtmlAgilityPack.CssSelector package
using NUnit.Framework;

[TestFixture]
public class TestClass
{
    [Test]
    public void TestMethod()
    {
        var html = @"<div class=""l"">
<span id=""ls_title_7596012"" class=""ls_h_desc"" title=""Required 10 marla 
old house in any block of bahria town"">Required 10 marla old house in 
any block of bahria town</span>
</div>";

        // Try HtmlAgilityPack with a CSS selector
        var doc = new HtmlAgilityPack.HtmlDocument();
        doc.LoadHtml(html);

        IList<HtmlNode> nodes = doc.QuerySelectorAll("div.l #ls_title_7596012");
        Assert.IsNotEmpty(nodes);
        // Expected value first, actual second. The \r\n assumes the source file uses CRLF line endings.
        Assert.AreEqual("Required 10 marla old house in \r\nany block of bahria town", nodes.First().InnerText);

        // Try with XPath
        var xpath = @"//*[@id=""ls_title_7596012""]";
        nodes = doc.DocumentNode.SelectNodes(xpath);
        Assert.IsNotEmpty(nodes);
        Assert.AreEqual("Required 10 marla old house in \r\nany block of bahria town", nodes.First().InnerText);
    }
}

Answer 1 (score: 0)

Try this if you need to get the "title" attribute value:

var title = htmlDocument.DocumentNode.SelectSingleNode("//*[@id = 'ls_title_7596012']").GetAttributeValue("title", "");

// Note: SelectSingleNode returns an HtmlNode; with HtmlAgilityPack an /@title path may yield the span element rather than the attribute value.
var title = htmlDocument.DocumentNode.SelectSingleNode("//*[@id = 'ls_title_7596012']/@title");

If you need to get the inner text, you can try:

var text = htmlDocument.DocumentNode.SelectSingleNode("//*[@id = 'ls_title_7596012']").InnerText;

var text = htmlDocument.DocumentNode.SelectSingleNode("//*[@id = 'ls_title_7596012']/text()").InnerText;
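
For reference, two things stand out in the question's XPath: the class is written as the digit "1" although the markup uses the letter "l", and the trailing //title looks for a child element named title, not the title attribute. Since SelectSingleNode returns null when nothing matches, it may also help to guard against that before reading a value. Below is a minimal sketch of both lookups, assuming the HtmlAgilityPack NuGet package is referenced; SpanReader and its methods are hypothetical names made up for illustration, and the id is assumed not to contain a single quote.

```csharp
using System;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

// Hypothetical helper for illustration; not part of HtmlAgilityPack.
public static class SpanReader
{
    // Returns the span's title attribute, or null when no span has that id.
    public static string GetTitle(string html, string id)
    {
        HtmlNode node = FindSpanById(html, id);
        return node?.GetAttributeValue("title", string.Empty);
    }

    // Returns the span's decoded, trimmed inner text, or null when no span has that id.
    public static string GetText(string html, string id)
    {
        HtmlNode node = FindSpanById(html, id);
        return node == null ? null : HtmlEntity.DeEntitize(node.InnerText).Trim();
    }

    private static HtmlNode FindSpanById(string html, string id)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Assumes id contains no single quote; SelectSingleNode returns null on no match.
        return doc.DocumentNode.SelectSingleNode($"//span[@id='{id}']");
    }
}
```

With the markup from the question, calling SpanReader.GetTitle(html, "ls_title_7596012") should read the title attribute and SpanReader.GetText(html, "ls_title_7596012") the inner text. When the same page is queried 16 times in a loop, it is probably worth loading the document once and reusing it instead of re-parsing the HTML on every call as this sketch does.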