Question

I have an XML file with the following content:

<p>
    <r>
        <t xml:space="preserve">Reading is easier, </t>
    </r>
    <r>
        <fldChar fldCharType="begin"/>
    </r>
    <r>
        <instrText xml:space="preserve"> REF _Ref516568558 \r \p \h </instrText>
    </r>
    <r>
        <fldChar fldCharType="separate"/>
    </r>
    <r>
        <t>This is all the text I want to capture</t>
    </r>
    <r>
        <fldChar fldCharType="end"/>
    </r>
    <r>
        <t xml:space="preserve">, in the new Reading view </t>
    </r>
    <r>
        <fldChar fldCharType="begin"/>
    </r>
    <r>
        <instrText xml:space="preserve"> REF _Not516755367 \r \h </instrText>
    </r>
    <r>
        <fldChar fldCharType="separate"/>
    </r>
    <r>
        <t>But not this...</t>
    </r>
    <r>
        <fldChar fldCharType="end"/>
    </r>
    <r>
        <t xml:space="preserve"> Some other text... </t>
    </r>
</p>

I know I can use the XPath expression //instrText[contains(text(), '_Ref')] to get <instrText xml:space="preserve"> REF _Ref516568558 \r \p \h </instrText>.

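What I want now is the text of the t nodes that sit between a <fldChar fldCharType="begin"/> and the following <fldChar fldCharType="end"/>, but only if there is an instrText between those two tags whose text contains '_Ref', i.e. instrText[contains(text(), '_Ref')].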

So, for the sample XML above, I want only <t>This is all the text I want to capture</t> to be returned.

Can this be done with a single XPath 1.0 expression?

Answer 1

Try this: p/r[preceding-sibling::r[fldChar/@fldCharType='begin'] and following-sibling::r[fldChar/@fldCharType='end']]/t[contains(., '_Ref')]

Answer 2

This is what I ended up using: //p/r[preceding-sibling::r[fldChar/@fldCharType='begin'] and following-sibling::r[fldChar/@fldCharType='end']][instrText[contains(text(), '_Ref')]]/following-sibling::r[t][1]
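
To cross-check what an expression like this should return, here is a minimal Python sketch that applies the rule from the question to the sample XML. It is not an XPath evaluation: the standard library's xml.etree.ElementTree does not support the preceding-sibling and following-sibling axes, so the sketch walks the r elements in document order instead. The function name captured_field_text and its marker parameter are hypothetical, and the sketch assumes fields are not nested.

```python
import xml.etree.ElementTree as ET

# Sample from the question, one <r> per line. Raw string so that the
# backslashes in the field instructions are kept literally.
SAMPLE = r"""<p>
    <r><t xml:space="preserve">Reading is easier, </t></r>
    <r><fldChar fldCharType="begin"/></r>
    <r><instrText xml:space="preserve"> REF _Ref516568558 \r \p \h </instrText></r>
    <r><fldChar fldCharType="separate"/></r>
    <r><t>This is all the text I want to capture</t></r>
    <r><fldChar fldCharType="end"/></r>
    <r><t xml:space="preserve">, in the new Reading view </t></r>
    <r><fldChar fldCharType="begin"/></r>
    <r><instrText xml:space="preserve"> REF _Not516755367 \r \h </instrText></r>
    <r><fldChar fldCharType="separate"/></r>
    <r><t>But not this...</t></r>
    <r><fldChar fldCharType="end"/></r>
    <r><t xml:space="preserve"> Some other text... </t></r>
</p>"""


def captured_field_text(p, marker="_Ref"):
    """Return the <t> texts found between a 'begin' and the next 'end'
    fldChar, but only for fields whose instrText contains `marker`.

    Hypothetical helper; assumes fields are not nested.
    """
    captured = []
    pending = None   # <t> texts seen in the open field; None outside a field
    matched = False  # True once the open field's instrText contains marker
    for run in p.findall("r"):
        fld = run.find("fldChar")
        kind = fld.get("fldCharType") if fld is not None else None
        if kind == "begin":
            pending, matched = [], False
        elif kind == "end":
            if pending is not None and matched:
                captured.extend(pending)
            pending = None
        elif pending is not None:
            instr = run.find("instrText")
            if instr is not None and marker in (instr.text or ""):
                matched = True
            t = run.find("t")
            if t is not None:
                pending.append(t.text or "")
    return captured


root = ET.fromstring(SAMPLE)
print(captured_field_text(root))
```

For the sample this yields only the text of the first field, because the second field's instruction contains _Not516755367 rather than a '_Ref' bookmark. To evaluate the XPath expressions themselves, a full XPath 1.0 engine would be needed.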

XPath expression to conditionally get adjacent nodes
