<cpc:note type="note">
<cpc:note-paragraph>This subclass
<cpc:u>covers</cpc:u>:
<cpc:subnote type="bullet">
<cpc:note-paragraph>equipment for the care, culture or rearing of all animals, or for obtaining their products, unless provided for elsewhere,e.g. milking
<cpc:class-ref scheme="cpc">A01J</cpc:class-ref>, shoeing animals
<cpc:class-ref scheme="cpc">A01L</cpc:class-ref>, veterinary devices
<cpc:class-ref scheme="cpc">A61D</cpc:class-ref>, culture of animal cells
<cpc:class-ref scheme="cpc">C12M</cpc:class-ref>,
<cpc:class-ref scheme="cpc">C12N</cpc:class-ref>;
</cpc:note-paragraph>
<cpc:note-paragraph>methods of breeding animals or new animal breeds.</cpc:note-paragraph>
</cpc:subnote>
</cpc:note-paragraph>
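Parsing this XML node is proving difficult. I am retrieving the specific note element correctly, but I am not sure how to parse the "inner" nodes. This comes from the EPO OPS API, which seems to embed display logic in its data, so one can think of it as HTML-like markup.
For now, I would like to know how to return the above as plain text using XPath, ignoring those child elements.
My current attempt uses ./cpc:note-paragraph/text()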
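Answer 0 (score: 0)
I am not sure how to do this in Elixir, but you need to check two things:
In C#, I can write something like this to find every text node under the paragraphs.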
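using System.Xml; using System.Xml.Linq; using System.Xml.XPath;
// reader is assumed to be an XmlReader over the OPS response
var x = XElement.Load(reader);
var xmn = new XmlNamespaceManager(reader.NameTable);
// "http://cpc-namespace" looks like a placeholder; use the URI the document actually declares for cpc
xmn.AddNamespace("cpc", "http://cpc-namespace");
// "//text()" matches text nodes at any depth, so text inside cpc:u and cpc:class-ref is included
var nodes = x.XPathEvaluate("//cpc:note-paragraph//text()", xmn);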
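For anyone not on .NET, here is a rough Python sketch of the same idea, using only the standard library. It is not the EPO's or the asker's code: the namespace URI, the trimmed `SAMPLE` document, and the helper names `direct_text` and `flat_text` are all assumptions made up for illustration. ElementTree's XPath subset has no `text()` step, so `itertext()` stands in for `//text()` (all descendant text), while `elem.text` shows how little you get when only the text before the first child element is read.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI for this sketch; use the one declared in the real response.
NS = {"cpc": "http://cpc-namespace"}

# Trimmed-down copy of the note from the question, with the namespace declared.
SAMPLE = """<cpc:note xmlns:cpc="http://cpc-namespace" type="note">
<cpc:note-paragraph>This subclass
<cpc:u>covers</cpc:u>:
<cpc:subnote type="bullet">
<cpc:note-paragraph>shoeing animals
<cpc:class-ref scheme="cpc">A01L</cpc:class-ref>, veterinary devices
<cpc:class-ref scheme="cpc">A61D</cpc:class-ref>;
</cpc:note-paragraph>
<cpc:note-paragraph>methods of breeding animals or new animal breeds.</cpc:note-paragraph>
</cpc:subnote>
</cpc:note-paragraph>
</cpc:note>"""


def direct_text(elem):
    """Text that precedes the first child element (None if there is none)."""
    return elem.text


def flat_text(elem):
    """All descendant text in document order, with whitespace runs collapsed."""
    return " ".join("".join(elem.itertext()).split())


root = ET.fromstring(SAMPLE)
outer = root.find("cpc:note-paragraph", NS)

print(repr(direct_text(outer)))  # only the leading fragment, nested text is missing
print(flat_text(outer))          # inline elements flattened into one plain-text string
```

The whitespace collapsing in `flat_text` is a design choice for display purposes; drop it if the original line breaks matter.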