Excluding child elements doesn't work

Time: 2015-05-04 19:25:45

Tags: javascript xml xpath

I have an XML string that I parse in JS. The data comes from the Wikipedia API and looks like this:

<part>
    <name>
        Other names   
    </name>=
    <value> * Some  * other * Names ([[IUPAC]])
        <ext>
            <name>
                ref
            </name>
            <attr/>
            <inner>
                {{SomePaper|3283|Datum=20. November 2014}}
            </inner>
            <close>
               &lt;/ref&gt;
            </close>
        </ext>
        * Last name
    </value>
 </part>

I want to use XPath to extract just the names, i.e. the text of <value> without any of its child nodes. I parse the XML with

var doc = new DOMParser().parseFromString(xmlString,'text/xml');

and then try to extract with

var result = doc.evaluate("//name[contains(text(), 'Other names')]/following-sibling::value[not(self::ext)]", doc, null, XPathResult.STRING_TYPE, null);

Yet the output is something like * Some * other * Names ([[IUPAC]])ref{{SomePaper|3283|Datum=20. November 2014}}</ref> * Last name

One thing that kind of works is

var result = doc.evaluate("//name[contains(text(), 'Other names')]/following-sibling::value[not(self::ext)]/text.()", doc, null, XPathResult.STRING_TYPE, null);

But then I'm losing everything that comes after the </ext>: "* Last name" is missing (the reason for that is explained here, I think).

What am I doing wrong here?

Update

Here's a fiddle: http://jsfiddle.net/v03xqoq4/1/

My desired output:

*Some *other *Names ([[IUPAC]]) * Last name

3 answers:

Answer 0 (score: 1)

Perhaps you need the following expression:

//name[contains(text(), 'Other names')]/following-sibling::value[1]/text()

Applied to the input XML you show, the result is (individual results separated by -------):

* Some * other * Names ([[IUPAC]])
-----------------------
* Last name

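As you can see, the expression returns two separate results, whereas you would like the result to be a single concatenated string, which is not possible with XPath 1.0. But I assume you can use JS string functions to concatenate the results.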
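The joining step could be sketched roughly as follows. This is only an illustration, not code from the question or the fiddle: the helper names joinTextPieces and selectTextPieces are made up here, and the whitespace collapsing is an assumption about how the desired output should look. The numeric 7 is the value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE.

```javascript
// Hypothetical helper: collapse whitespace runs in each piece,
// drop pieces that end up empty, and join the rest with one space.
function joinTextPieces(pieces) {
  return pieces
    .map(function (s) { return s.replace(/\s+/g, ' ').trim(); })
    .filter(function (s) { return s.length > 0; })
    .join(' ');
}

// Hypothetical browser-side helper: evaluate an XPath that selects text
// nodes and return their contents as an array, in document order.
function selectTextPieces(doc, xpath) {
  // 7 === XPathResult.ORDERED_NODE_SNAPSHOT_TYPE
  var snapshot = doc.evaluate(xpath, doc, null, 7, null);
  var pieces = [];
  for (var i = 0; i < snapshot.snapshotLength; i++) {
    pieces.push(snapshot.snapshotItem(i).textContent);
  }
  return pieces;
}

// Browser usage, assuming `doc` is the document parsed in the question:
// var xpath = "//name[contains(text(), 'Other names')]" +
//             "/following-sibling::value[1]/text()";
// console.log(joinTextPieces(selectTextPieces(doc, xpath)));
```

A snapshot result is used here instead of an iterator so that the loop does not depend on the document staying unmodified while the nodes are read.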

Now, a few details that might be useful. Let's look at your input XML:

<part>
    <name>
        Other names   
    </name>=
    <value> * Some  * other * Names ([[IUPAC]])
        <ext>
            <!--Irrelevant stuff-->
        </ext>
        * Last name
    </value>
 </part>

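The parts you are interested in are the text nodes that are children of the value element. In XPath, text nodes are identified with text() (in the same way that * identifies element nodes). You can get them by simply evaluating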

//value/text()

But your question implies that there might be several value elements, and that which one to select depends on the name element that precedes it.

Finally, there might be something wrong with your fiddle. Even doc.evaluate("//*", doc, null, XPathResult.STRING_TYPE, null) does not return anything.
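One thing that may be worth ruling out when an XPath query returns nothing is a failed parse. With 'text/xml', DOMParser does not throw on malformed input; it returns a document that contains a parsererror element instead. The check below is a minimal sketch under that assumption. The helper name hasParseError is made up for illustration, and nothing here confirms that this is what is wrong with the fiddle.

```javascript
// Hypothetical helper: true when the parsed document contains a
// <parsererror> element, which browsers emit for malformed XML.
function hasParseError(doc) {
  return doc.getElementsByTagName('parsererror').length > 0;
}

// Browser-only demonstration; skipped where DOMParser is unavailable.
if (typeof DOMParser !== 'undefined') {
  // Deliberately malformed: <part> is never closed.
  var brokenDoc = new DOMParser().parseFromString(
    '<part><name>Other names</name>', 'text/xml');
  console.log('parse error present:', hasParseError(brokenDoc));
}
```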

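Answer 1 (score: 0)

If all you really want is the part name, without the data from inside the value tag ("no child nodes of <value>"), just use {{1}}

If that is not what you want, please explain what output you expect to see.

Edit, based on the comments:

OK, I think you just have an extra period in your XPath string.

Try /part/name[contains(text(), 'Other names')]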

Answer 2 (score: 0)

This is how I got it working:

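var iterator = doc.evaluate("//name[contains(text(), 'Andere Namen')]/following-sibling::value[1]/text()", doc, null, XPathResult.ORDERED_NODE_ITERATOR_TYPE, null);

try {
  var thisNode = iterator.iterateNext();

  // Log each text node of <value> in document order.
  while (thisNode) {
    console.log(thisNode.textContent);
    thisNode = iterator.iterateNext();
  }
} catch (e) {
  // iterateNext() throws if the document is modified during iteration.
  console.log('Error: document tree modified during iteration: ' + e);
}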

Fiddle: http://jsfiddle.net/ryv72mqm/2/

Thanks to @MathiasMüller for getting me there!