Avoiding namespace prefixes for XHTML with Saxon XPath

Time: 2016-05-07 19:57:46

Tags: xpath namespaces xhtml saxon jaxp

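I am using Saxon HE 9.6 as the JAXP implementation.

I have an HTML document that declares the XHTML namespace.

//*:title returns the expected value, but //title does not.

I would really like to use //title. How can I do that?

Alternatively, can I remove the namespace from an already-built Document?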

1 answer:

Answer 0: (score: 4)

See https://saxonica.plan.io/boards/3/topics/1649: you can cast the JAXP XPath object created by Saxon's XPathFactory implementation to net.sf.saxon.xpath.XPathEvaluator and then set the default element namespace used for XPath evaluation, for example:

    ((XPathEvaluator)xpath).getStaticContext().setDefaultElementNamespace("http://www.w3.org/1999/xhtml");

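The path //title will then select title elements in the XHTML namespace, because unprefixed element names in the expression are resolved against that default namespace. I tested this with the following sample: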

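    // Imports needed by the snippet below
    import java.io.StringReader;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathFactory;
    import net.sf.saxon.xpath.XPathEvaluator;
    import net.sf.saxon.xpath.XPathFactoryImpl;
    import org.xml.sax.InputSource;

    // Instantiate Saxon's factory directly so the cast to XPathEvaluator is valid
    XPathFactory xpathFactory = new XPathFactoryImpl();
    XPath xpath = xpathFactory.newXPath();
    ((XPathEvaluator)xpath).getStaticContext().setDefaultElementNamespace("http://www.w3.org/1999/xhtml");

    String xhtmlSample = "<html xmlns='http://www.w3.org/1999/xhtml'><head><title>This is a test</title></head><body><h1>Test</h1></body></html>";
    InputSource source = new InputSource(new StringReader(xhtmlSample));

    System.out.println("Found: " + xpath.evaluate("//title", source));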
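
For comparison, here is a minimal sketch of the same check without the Saxon-specific cast. It assumes the JDK's built-in XPath 1.0 implementation is the one returned by XPathFactory.newInstance() (that is, no other JAXP XPath provider is registered on the classpath). The class name XhtmlTitleDemo and the helper evaluate are made up for illustration. XPath 1.0 has no default element namespace and no *:title wildcard, so the sketch shows //title matching nothing and uses a local-name() test as the prefix-free alternative.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original answer.
class XhtmlTitleDemo {
    static final String XHTML_SAMPLE =
        "<html xmlns='http://www.w3.org/1999/xhtml'><head><title>This is a test</title></head>"
        + "<body><h1>Test</h1></body></html>";

    // Parses the sample into a namespace-aware DOM and returns the string value
    // of the given XPath expression evaluated against it.
    static String evaluate(String expression) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // keep the XHTML namespace on the elements
            Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XHTML_SAMPLE)));
            // Assumption: this resolves to the JDK's default XPath 1.0 implementation.
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Unprefixed name test: only matches elements in no namespace.
        System.out.println("//title: '" + evaluate("//title") + "'");
        // Namespace-agnostic test on the local name, no prefix required.
        System.out.println("local-name(): '" + evaluate("//*[local-name()='title']") + "'");
    }
}
```

The local-name() form ignores the namespace entirely, so it would also match a title element from any other vocabulary in the document. Setting the default element namespace on Saxon's static context, as above, keeps //title readable while still matching only XHTML elements.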