Question

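To automatically replace keywords with links from a keyword-to-link list, I need to select the text inside paragraphs (p) and list items (li) that is not already linked, not inside a script, and not manually excluded. This is for use in Drupal's Alinks module.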

I modified an existing XPath selector as follows and would like feedback on whether it is valid and whether it can be improved:

//*[p or li]//text()[not(ancestor::a) and not(ancestor::script) and not(ancestor::*[@data-alink-ignore])]

The XPath has to work with any HTML5 content, including self-closing tags that are not well-formed XML. That is how the module is designed, and it works well.

Answer 1

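To select the text node descendants of p or li elements that are not descendants of an a element, a script element, or an element carrying a data-alink-ignore attribute, you can use the following XPath 1.0: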

//*[self::p|self::li]
   //text()[
      not(ancestor::a|ancestor::script|ancestor::*[@data-alink-ignore])
   ]
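
To make the semantics of this expression concrete, here is a minimal sketch of the same filter written as a plain tree walk with Python's standard `html.parser`. It does not evaluate XPath and is not how the Alinks module works; the class name `LinkableText`, the `VOID` set, and the sample markup are all hypothetical and exist only for illustration. Unlike the XPath, the sketch skips whitespace-only text nodes to keep the output readable.

```python
from html.parser import HTMLParser

# HTML void elements never get a closing tag, so they are not pushed on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class LinkableText(HTMLParser):
    """Hypothetical helper: collect text inside <p>/<li> that has no <a>,
    <script>, or [data-alink-ignore] ancestor (mirrors the XPath predicate)."""

    def __init__(self):
        super().__init__()
        self.stack = []   # one (tag, has_ignore_attr) pair per open element
        self.found = []   # matching text nodes, in document order

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append((tag, "data-alink-ignore" in dict(attrs)))

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; ignore stray end tags.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        tags = [t for t, _ in self.stack]
        inside = "p" in tags or "li" in tags            # //*[self::p|self::li]//text()
        excluded = ("a" in tags or "script" in tags     # ancestor::a | ancestor::script
                    or any(flag for _, flag in self.stack))  # ancestor::*[@data-alink-ignore]
        # Whitespace-only nodes are skipped here; the XPath would return them too.
        if inside and not excluded and data.strip():
            self.found.append(data)


# Sample markup (an assumption): an unclosed <br> and a valueless attribute.
html = (
    '<div>loose text'
    '<p>Try Drupal<br>today, see <a href="/d">the docs</a>.</p>'
    '<ul><li>Views</li><li data-alink-ignore>Skip me</li></ul>'
    '<script>var x = 1;</script></div>'
)
parser = LinkableText()
parser.feed(html)
parser.close()
print(parser.found)  # ['Try Drupal', 'today, see ', '.', 'Views']
```

The text directly inside the div, the link text, the ignored list item, and the script body are all left out, which is what the `self::p|self::li` step combined with the `not(...)` predicate expresses.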

Answer 2

Your XPath expression is not valid: a / is missing before text(). A valid expression would be:

//*[p or li]/text()[not(ancestor::a) and not(ancestor::script) and not(ancestor::*[@data-alink-ignore])]

Without the XML source file, however, it is impossible to tell whether this expression matches the nodes you want.