Question

<cpc:note type="note">
<cpc:note-paragraph>This subclass 
    <cpc:u>covers</cpc:u>: 
    <cpc:subnote type="bullet">
        <cpc:note-paragraph>equipment for the care, culture or rearing of all animals, or for obtaining their products, unless provided for elsewhere,e.g. milking 
            <cpc:class-ref scheme="cpc">A01J</cpc:class-ref>, shoeing animals 
            <cpc:class-ref scheme="cpc">A01L</cpc:class-ref>, veterinary devices 
            <cpc:class-ref scheme="cpc">A61D</cpc:class-ref>, culture of animal cells 
            <cpc:class-ref scheme="cpc">C12M</cpc:class-ref>, 
            <cpc:class-ref scheme="cpc">C12N</cpc:class-ref>;
        </cpc:note-paragraph>
        <cpc:note-paragraph>methods of breeding animals or new animal breeds.</cpc:note-paragraph>
    </cpc:subnote>
</cpc:note-paragraph>

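I'm having a really hard time parsing this XML node. I am retrieving the specific note element correctly, but I'm not sure how to parse the "inner" nodes. This comes from the EPO OPS API, which seems to embed display logic in its markup, so one can think of it much like HTML.

For now, I'd like to know how to return the above as plain text using XPath, ignoring the child element markup.

My current attempt uses ./cpc:note-paragraph/text()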

Answer 1

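I'm not sure how to do this in Elixir, but you need to check two things:

Namespaces. You must declare the cpc: namespace somewhere so that it can be used in the expression. In .NET we use an XmlNamespaceManager, which we pass to the XPath evaluation methods.
Check that your expression actually reaches all the way down to the cpc:note-paragraph elements.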

In C# I could write something like this to find all the text nodes under the paragraphs.

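// Requires: using System.Xml; using System.Xml.Linq; using System.Xml.XPath;
var x = XElement.Load(reader); // reader is an existing XmlReader over the response
var xmn = new XmlNamespaceManager(reader.NameTable);
// Placeholder URI: use the URI that the document actually binds to the cpc prefix.
xmn.AddNamespace("cpc", "http://cpc-namespace");
// "//text()" selects text nodes at any depth, not only direct children.
var nodes = x.XPathEvaluate("//cpc:note-paragraph//text()", xmn);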
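
For readers who are not on .NET, the same two checks can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not the EPO's or the asker's actual code: the namespace URI is a placeholder, the sample XML is a trimmed copy of the note from the question, and plain_text is a hypothetical helper name. Note that ElementTree's XPath subset does not support text(), so itertext() stands in here for the //text() step.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI (an assumption): replace it with the URI that the
# real response binds to the cpc prefix.
CPC_NS = "http://example.com/cpc"
NS = {"cpc": CPC_NS}

# Trimmed-down version of the note from the question, with the namespace declared.
SAMPLE = f"""
<cpc:note xmlns:cpc="{CPC_NS}" type="note">
  <cpc:note-paragraph>This subclass
    <cpc:u>covers</cpc:u>:
    <cpc:subnote type="bullet">
      <cpc:note-paragraph>veterinary devices
        <cpc:class-ref scheme="cpc">A61D</cpc:class-ref>;
      </cpc:note-paragraph>
      <cpc:note-paragraph>methods of breeding animals or new animal breeds.</cpc:note-paragraph>
    </cpc:subnote>
  </cpc:note-paragraph>
</cpc:note>
""".strip()


def plain_text(element):
    """Join every descendant text node and collapse runs of whitespace."""
    return " ".join("".join(element.itertext()).split())


root = ET.fromstring(SAMPLE)

# The prefix map plays the role of XmlNamespaceManager in the C# snippet.
outer = root.find("cpc:note-paragraph", NS)  # first direct child paragraph

# .text holds only the text before the first child element.
direct_only = " ".join((outer.text or "").split())

# itertext() walks the whole subtree, like //text() in full XPath.
flattened = plain_text(outer)

print(direct_only)
print(flattened)
```

The contrast between direct_only and flattened shows why an expression that stops at the paragraph's own text misses everything inside cpc:u, cpc:subnote and cpc:class-ref. With a full XPath 1.0 engine, the descendant-text expression from the C# snippet would do the same job directly.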
