Question

<div id="caption">
<div>
    Position: Passenger Side Front
    <br></br>
    Color: Black
    <br></br>
    Finish: Smooth / Paintable
    <br></br>
    Part Brand: LatchWell
    <br></br>
    Lifetime Warranty
</div>
</div>

I need an XPath expression to get the value of Part Brand:. I want the output to be LatchWell.

Here is my code:

  tag = htmlDoc.DocumentNode.SelectSingleNode("//div[@id='caption']//div");
  if (tag != null)
  {
      wi.Brand = tag.InnerText.Trim();
  }

I can't split the text with a split function, because the data above and below Part Brand is dynamic.

Answer 1

Since there are no HTML tags other than the two <div> tags for HtmlAgilityPack to select, you have to use some other approach, such as a Regex evaluation.

Assuming the markup always contains Part Brand: something <br><br>, you can capture the text between Part Brand: and <br> to get the brand name.

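// \s* tolerates the line break and indentation between the value and <br>,
// which "." does not match by default.
HtmlNode brandNode = doc.DocumentNode.SelectSingleNode("//div[@id='caption']//div");
string brand = Regex.Match(brandNode.InnerHtml, @"Part Brand:\s*(.*?)\s*<br").Groups[1].Value;
Console.WriteLine(brand);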

This simple use of Regex.Match(string, regexp) outputs LatchWell.
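
One thing to watch with the regex approach: in the markup as posted, the value and the following <br> sit on separate lines, and `.` does not match a newline by default. The following is a minimal, self-contained sketch of the same idea that also checks whether the match succeeded. The `BrandExtractor` class, the `ExtractBrand` method, and the sample string are made up for illustration; the string only approximates what `InnerHtml` might contain.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BrandExtractor
{
    // Hypothetical helper: returns the text after "Part Brand:" up to the
    // next "<br", or null when the label is not present.
    public static string ExtractBrand(string innerHtml)
    {
        // \s* absorbs the newline and indentation around the value;
        // the lazy (.*?) stops at the first "<br" that follows.
        Match m = Regex.Match(innerHtml, @"Part Brand:\s*(.*?)\s*<br");
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        // Assumed sample input, modeled on the question's markup.
        string innerHtml =
            "\n    Color: Black\n    <br></br>\n" +
            "    Part Brand: LatchWell\n    <br></br>\n" +
            "    Lifetime Warranty\n";

        Console.WriteLine(ExtractBrand(innerHtml) ?? "(no brand found)");
    }
}
```

Returning null instead of an empty string lets the caller tell "label missing" apart from "label present but empty", which matters when the surrounding lines are dynamic.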

Answer 2

Actually, you can use XPath to select a specific line of text in the HTML, for example:

var tag = htmlDoc.DocumentNode
                 .SelectSingleNode("//div[@id='caption']/div/text()[contains(.,'Part Brand:')]");
// Given the HTML input as posted in this question, the following prints "LatchWell"
Console.WriteLine(tag.InnerText.Trim().Replace("Part Brand: ", ""));
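
Because the lines around Part Brand are dynamic, the text node may be missing altogether, in which case `SelectSingleNode` returns null. Below is a hedged sketch of the same XPath lookup with a null check, assuming the HtmlAgilityPack NuGet package is referenced. The `BrandFromXPath` class and `GetBrand` method are hypothetical names, and the sample markup is a trimmed, well-formed copy of the question's HTML rather than the exact input.

```csharp
using System;
using HtmlAgilityPack;

public static class BrandFromXPath
{
    // Hypothetical helper: returns the value after "Part Brand:",
    // or null if no matching text node exists.
    public static string GetBrand(string html)
    {
        const string label = "Part Brand:";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Select the text node inside the inner div that contains the label.
        HtmlNode node = doc.DocumentNode.SelectSingleNode(
            "//div[@id='caption']/div/text()[contains(.,'" + label + "')]");
        if (node == null)
        {
            return null;
        }

        // Keep everything after the label, whatever spacing follows the colon.
        string text = node.InnerText.Trim();
        int start = text.IndexOf(label, StringComparison.Ordinal) + label.Length;
        return text.Substring(start).Trim();
    }

    public static void Main()
    {
        // Assumed sample input, simplified from the question.
        string html =
            "<div id=\"caption\"><div>Color: Black<br>" +
            "Part Brand: LatchWell<br>Lifetime Warranty</div></div>";

        Console.WriteLine(GetBrand(html) ?? "(no brand found)");
    }
}
```

Cutting the string at the label's position, instead of replacing the exact text "Part Brand: ", keeps the result clean even if the spacing after the colon varies.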

XPath: How to get data from a div tag
