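I am trying to parse an OpenOffice spreadsheet to get the rows that have unique values in the first column.
That is, from the following XML fragment I want to retrieve all <table:table-row> elements whose first child <table:table-cell> has a unique <text:p> value.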
<table:table table:name="foo">
<table:table-row>
<table:table-cell>
<text:p>1</text:p>
</table:table-cell>
<table:table-cell>
<text:p>foo</text:p>
</table:table-cell>
</table:table-row>
<table:table-row>
<table:table-cell>
<text:p>2</text:p>
</table:table-cell>
<table:table-cell>
<text:p>bar</text:p>
</table:table-cell>
</table:table-row>
<table:table-row>
<table:table-cell>
<text:p>1</text:p>
</table:table-cell>
<table:table-cell>
<text:p>baz</text:p>
</table:table-cell>
</table:table-row>
</table:table>
I would like to get the following output as nodes:
<table:table-row>
<table:table-cell>
<text:p>1</text:p>
</table:table-cell>
<table:table-cell>
<text:p>foo</text:p>
</table:table-cell>
</table:table-row>
<table:table-row>
<table:table-cell>
<text:p>2</text:p>
</table:table-cell>
<table:table-cell>
<text:p>bar</text:p>
</table:table-cell>
</table:table-row>
How can I do this using XPath?
Answer 0 (score: 0)
A pure XPath solution should be:
/table:table/table:*[not(
.//text:p[1]
= preceding-sibling::table:table-row//text:p[1]
)]
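This applies if the expected output is a sequence of table:table-row nodes rather than an XML document, as someone correctly noted in the comments.
A variant that compares only the first cell of each row: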
/table:table/table:*[not(
./table:*[1]//text:*[1]
= preceding-sibling::table:*/table:*[1]/text:*[1]
)]
Answer 1 (score: 0)
This XPath produces the desired output:
/table:table/table:table-row[not(./table:table-cell[1]/text:p/text() = preceding-sibling::table:table-row/table:table-cell[1]/text:p/text())]
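
As a cross-check of which rows such an expression should select, here is a minimal Python sketch that applies the same "keep the first row for each first-column value" rule by hand. It does not evaluate the XPath itself: the standard library's ElementTree does not support the preceding-sibling axis, so the filtering is done in plain Python. The namespace declarations on the root element are added here so the fragment parses on its own (the fragment in the question omits them), and the helper name first_occurrence_rows is hypothetical.

```python
import xml.etree.ElementTree as ET

# OpenDocument namespace URIs; the prefixes are only local labels.
NS = {
    "table": "urn:oasis:names:tc:opendocument:xmlns:table:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}

# Fragment from the question, with namespace declarations added (assumption)
# so that it is a well-formed standalone document.
SAMPLE = """\
<table:table xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0"
             xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
             table:name="foo">
  <table:table-row>
    <table:table-cell><text:p>1</text:p></table:table-cell>
    <table:table-cell><text:p>foo</text:p></table:table-cell>
  </table:table-row>
  <table:table-row>
    <table:table-cell><text:p>2</text:p></table:table-cell>
    <table:table-cell><text:p>bar</text:p></table:table-cell>
  </table:table-row>
  <table:table-row>
    <table:table-cell><text:p>1</text:p></table:table-cell>
    <table:table-cell><text:p>baz</text:p></table:table-cell>
  </table:table-row>
</table:table>
"""


def first_occurrence_rows(table):
    """Return the rows whose first-cell text did not appear in an earlier row."""
    seen = set()
    rows = []
    for row in table.findall("table:table-row", NS):
        # Text of the <text:p> inside the first <table:table-cell> of this row.
        key = row.findtext("table:table-cell[1]/text:p", default="", namespaces=NS)
        if key not in seen:
            seen.add(key)
            rows.append(row)
    return rows


table = ET.fromstring(SAMPLE)
unique_rows = first_occurrence_rows(table)
for row in unique_rows:
    print([p.text for p in row.iterfind("table:table-cell/text:p", NS)])
```

For the sample data this keeps the 1/foo and 2/bar rows and drops 1/baz, which matches the output requested in the question. With a library that implements full XPath 1.0, the expressions from the answers can be evaluated directly, provided the table and text prefixes are bound to the same namespace URIs.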