DocumentBuilder throws java.net.MalformedURLException: no !/ in spec

Time: 2016-07-20 18:46:09

Tags: java xml-parsing

I have an XML file in my resources folder. This is what I have been trying:

First, get the file from the resources folder:

ClassLoader classLoader = ParseXML.class.getClassLoader();
File file = new File(classLoader.getResource("sample.xml").getFile());

Then read the file using DOM parsing:

    DocumentBuilder dBuilder = null;
    Document doc = null;
    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    dBuilder = dbFactory.newDocumentBuilder();
    doc = dBuilder.parse(file);

    doc.getDocumentElement().normalize();
    System.out.println("Root element :" + doc.getDocumentElement().getNodeName());

It keeps giving me "java.net.MalformedURLException: no !/ in spec". What am I doing wrong?

I also tried this:

fileAsString = IOUtils.toString(classLoader.getResourceAsStream("sample.xml"));
doc = dBuilder.parse(new InputSource(new ByteArrayInputStream(fileAsString.getBytes("utf-8"))));

But the error stays the same. Any help would be greatly appreciated. Thanks.

As requested, here is the stack trace:

java.net.MalformedURLException: no !/ in spec
    at java.net.URL.<init>(Unknown Source)
    at java.net.URL.<init>(Unknown Source)
    at java.net.URL.<init>(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
    at com.northwestern.XMLParse.ParseXML.main(ParseXML.java:47)
Caused by: java.lang.NullPointerException: no !/ in spec
    at sun.net.www.protocol.jar.Handler.parseAbsoluteSpec(Unknown Source)
    at sun.net.www.protocol.jar.Handler.parseURL(Unknown Source)
    ... 18 more

1 answer:

Answer 0 (score: 0)

From your comments, the XML file contains the following line:

<!DOCTYPE Policies PUBLIC "-//OpenSSO Policy Administration DTD//EN" "jar://com/sun/identity/policy/policyAdmin.dtd"> 

The exception occurs because a jar: URL always takes the form jar: jar-url !/ jar-entry-path, so "jar://com/sun/identity/policy/policyAdmin.dtd" is not a valid URL. An example of a valid URL would be: jar:http://www.example.com/lib/dtds.jar!/com/sun/identity/policy/policyAdmin.dtd
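
To see the difference between the two forms without involving the XML parser, here is a minimal sketch that only asks java.net.URL whether it accepts each string. The class name JarUrlCheck and the helper isValidUrl are made up for illustration; the two URL strings are the ones quoted above.

```java
import java.net.MalformedURLException;
import java.net.URL;

class JarUrlCheck {
    /** Returns true if java.net.URL accepts the given spec. */
    static boolean isValidUrl(String spec) {
        try {
            new URL(spec); // only parses the string; no connection is opened
            return true;
        } catch (MalformedURLException e) {
            // For a jar: spec without "!/", this is where "no !/ in spec" surfaces
            return false;
        }
    }

    public static void main(String[] args) {
        // The system ID from the DOCTYPE: has no "!/" separator
        System.out.println(isValidUrl("jar://com/sun/identity/policy/policyAdmin.dtd"));
        // The well-formed example: jar: + inner URL + "!/" + entry path
        System.out.println(isValidUrl(
            "jar:http://www.example.com/lib/dtds.jar!/com/sun/identity/policy/policyAdmin.dtd"));
    }
}
```

The first string is rejected while it is still being parsed, which is why the stack trace ends in java.net.URL.<init> rather than in any file or network access.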

The ideal solution, of course, is to fix the XML file or ask its author to fix it. But it sounds like you don't have that option.

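The first thing I would try is to suppress dereferencing of system IDs by adding dbFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true); to your code before creating the DocumentBuilder. However, as your comment says, that does not seem to work.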

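Next I would try dbFactory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");, but I suspect the implementation will still try to convert the DOCTYPE's system ID to a URL, which means setAttribute would not help either.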
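
For reference, this is roughly where those two calls would sit, as a sketch only. The class name HardenedFactory is hypothetical, and the code assumes a JAXP 1.5 capable parser (such as the one bundled with the JDK) that recognizes ACCESS_EXTERNAL_DTD. Both settings have to be applied to the factory before newDocumentBuilder() is called, and, as noted above, neither is expected to cure this particular exception.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

class HardenedFactory {
    /** Builds a factory with both restrictions set before any builder is created. */
    static DocumentBuilderFactory create() throws ParserConfigurationException {
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        // Turn on the parser's secure-processing mode
        dbFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // An empty protocol list means no protocol is allowed for external DTDs
        dbFactory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        return dbFactory;
    }
}
```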

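This is probably someone's misguided attempt to specify a classpath resource as a URL (which cannot be done without knowing the location of the containing .jar file). You can work around their mistake by setting an EntityResolver:
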
dBuilder.setEntityResolver(new EntityResolver() {
    @Override
    public InputSource resolveEntity(String publicID,
                                     String systemID)
    throws SAXException,
           IOException {

        if (systemID.startsWith("jar:") && !systemID.contains("!/")) {
            String path = systemID.replaceFirst("^jar:/*", "/");
            return new InputSource(ParseXML.class.getResourceAsStream(path));
        }
        return null;
    }
});
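
Putting it together, here is a self-contained sketch of the same resolver applied to an in-memory document. The class name ResolverDemo and the SAMPLE_XML string are stand-ins invented for this example, not the real sample.xml. One assumption goes beyond the snippet above: if the DTD is not found on the classpath, the resolver hands back an empty DTD so that parsing can still proceed instead of passing a null stream to InputSource.

```java
import java.io.InputStream;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class ResolverDemo {
    // Hypothetical stand-in for sample.xml, using the DOCTYPE quoted in the answer
    static final String SAMPLE_XML =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE Policies PUBLIC \"-//OpenSSO Policy Administration DTD//EN\" "
        + "\"jar://com/sun/identity/policy/policyAdmin.dtd\">\n"
        + "<Policies/>";

    /** Parses the XML with the workaround resolver and returns the root element name. */
    static String rootElementName(String xml) throws Exception {
        DocumentBuilder dBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        dBuilder.setEntityResolver((publicID, systemID) -> {
            if (systemID != null && systemID.startsWith("jar:") && !systemID.contains("!/")) {
                // Treat the malformed jar: URL as a classpath path
                String path = systemID.replaceFirst("^jar:/*", "/");
                InputStream dtd = ResolverDemo.class.getResourceAsStream(path);
                // Assumption: fall back to an empty DTD when the resource is missing
                return dtd != null ? new InputSource(dtd)
                                   : new InputSource(new StringReader(""));
            }
            return null; // let the parser resolve everything else as usual
        });
        Document doc = dBuilder.parse(new InputSource(new StringReader(xml)));
        doc.getDocumentElement().normalize();
        return doc.getDocumentElement().getNodeName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Root element :" + rootElementName(SAMPLE_XML));
    }
}
```

The empty-DTD fallback is only appropriate if the document does not rely on entities or default attribute values declared in the DTD; otherwise the real policyAdmin.dtd needs to be on the classpath.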