Question

I'm very new to XPath, so I'm looking for some help with a pattern to match the following. My current attempt doesn't behave as I expect.

//text()[1][contains(.,'wordToMatch') and not(self::a)]

As I'm sure you can tell from the pattern above, I'm a noob.

Sample payload 1:

<p>Sample 1 <a href="shouldNotMatchWrappedInA">wordToMatch</a> some 
random text 
to not be matched followed by wordToMatch, this should work.</p>

Expected result 1:

wordToMatch (not the one inside the 'a' tags, but the following one)

Sample payload 2:

<p>Sample 2 <a href="shouldNotMatchWrappedInA">wordToMatch</a> some 
random text to not be matched followed by <b>wordToMatch</b> this
should work.</p>

Expected result 2:

wordToMatch (the one inside the 'b' tags)

Sample payload 3:

<p>Sample 3 <a href="shouldNotMatchWrappedInA">wordToMatch</a> some 
random text to not be matched followed by wordToMatch followed by
further occurrences of wordToMatch which should not be matched.</p>

Expected result 3:

wordToMatch (The second occurrence of the term)

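The expected result for all 3 payloads is the first occurrence of wordToMatch that is NOT wrapped in 'a' tags.

The language that will ultimately implement this pattern is Java.

Please help.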

Answer 1

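The question is still unclear to me as it stands; I think adding the exact expected output for each sample would clear it up. Anyway, based on the current information, consider the following XPath, which matches any element whose inner text is exactly equal to 'wordToMatch' and which is not itself an 'a' element:

//*[.='wordToMatch'][not(self::a)]

This returns the 'b' element in the second case and nothing in the other cases. If you want to loosen the match and return the text node (instead of the parent element), use this:

//*[not(self::a)]/text()[contains(.,'wordToMatch')]

Update:

In XPath 2.0 or later, you can use the for construct:

for $t in //*[not(self::a)]/text()[contains(.,'wordToMatch')] return 'wordToMatch'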

xpatheval demo
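Since the question mentions Java as the target language, here is a minimal sketch of how the XPath 1.0 text-node expression could be evaluated with the JDK's built-in javax.xml.xpath API. The class and method names (FirstMatchOutsideAnchor, firstMatchingText) are made up for illustration, and the (...)[1] wrapper around the expression is an addition of this sketch, used to keep only the first match in document order. It also assumes the payload is well-formed XML, as the three samples are.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class; the name is not from the original answer.
class FirstMatchOutsideAnchor {

    // Text nodes whose parent element is not <a> and that contain the word.
    // The outer (...)[1] is this sketch's addition: it keeps only the first
    // such node in document order.
    static final String EXPR =
        "(//*[not(self::a)]/text()[contains(.,'wordToMatch')])[1]";

    // Returns the content of the first matching text node, or null if none matches.
    static String firstMatchingText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node node = (Node) xpath.evaluate(EXPR, doc, XPathConstants.NODE);
            return node == null ? null : node.getNodeValue();
        } catch (Exception e) {
            // Parsing or XPath errors are wrapped to keep the sketch short.
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String sample2 = "<p>Sample 2 <a href=\"shouldNotMatchWrappedInA\">wordToMatch</a> some "
                + "random text to not be matched followed by <b>wordToMatch</b> this "
                + "should work.</p>";
        // The first text node outside <a> that contains the word is the one inside <b>.
        System.out.println(firstMatchingText(sample2));
    }
}
```

Note that XPath selects whole nodes, not substrings: for samples 1 and 3 this returns the entire text node that follows the 'a' element, and picking out the first occurrence of the word inside it would still have to be done in Java (for example with String.indexOf).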
