Parsing XML files with Java when the file path contains spaces

Asked: 2009-07-15 15:20:27

Tags: java xml spaces filepath

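I have files on my file system, on Windows XP, that I want to parse with Java (JRE 1.6).

The problem is that I don't understand how Java and Xerces work together when the file path contains spaces.

If the file's path contains no spaces, everything works fine.

If it does contain spaces, I can run into this kind of trouble, even when I call the parser with a FileInputStream instance: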

java.net.UnknownHostException: .
    at java.net.PlainSocketImpl.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at sun.net.NetworkClient.doConnect(Unknown Source)
    at sun.net.NetworkClient.openServer(Unknown Source)
    at sun.net.ftp.FtpClient.openServer(Unknown Source)
    at sun.net.ftp.FtpClient.openServer(Unknown Source)
    at sun.net.www.protocol.ftp.FtpURLConnection.connect(Unknown Source)
    at sun.net.www.protocol.ftp.FtpURLConnection.getInputStream(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
    at javax.xml.parsers.DocumentBuilder.parse(Unknown Source)

(sun.net.ftp.FtpClient.openServer??? WTF?)

Or this kind of trouble:

java.net.MalformedURLException: unknown protocol: d
    at java.net.URL.<init>(Unknown Source)
    at java.net.URL.<init>(Unknown Source)
    at java.net.URL.<init>(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)

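(It says unknown protocol: d because, I guess, the file is located on the D drive.)

Does anyone know why this happens and how to work around it? I tried supplying my own EntityResolver, but my logs show that it is not even called before the crash.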


Edit

Here is the code that calls the parser:

public Document fileToDom(File file) throws ProcessException {
    Document doc = null;
    try {
        DocumentBuilderFactory db = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = db.newDocumentBuilder();
        if (this.errorHandler != null) {
            builder.setErrorHandler(this.errorHandler);
        } else {
            builder.setErrorHandler(new DefaultHandler());
        }
        FileInputStream test = new FileInputStream(file);
        doc = builder.parse(test);
        ...
    } catch (Exception e) {...}
    ...
}

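For now I find myself forced to strip the DOCTYPE before parsing, which eliminates all the problems, and DTD validation along with them... not such a good solution.

4 Answers: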

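Answer 0 (score: 2)

Are you just calling DocumentBuilder.parse(filename)?

If so, that fails because that overload expects a URI, not a file path. Open a FileInputStream on the file and pass it to DocumentBuilder.parse(InputStream) instead.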
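A minimal sketch of the stream-based call described in this answer. It adds one thing the answer does not mention, so treat it as an assumption: a system ID is passed along with the stream, so that a relative DTD reference in the DOCTYPE has a base URI to resolve against. The class and method names (StreamParseSketch, parseWithBase) are made up for illustration.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of the question's code.
final class StreamParseSketch {

    static Document parseWithBase(File file)
            throws IOException, SAXException, ParserConfigurationException {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();

        // File.toURI() percent-encodes spaces, so the parser receives a valid
        // file: URI as the base for relative references such as a DTD path.
        String systemId = file.toURI().toString();

        InputStream in = new FileInputStream(file);
        try {
            // parse(InputStream, String) takes the stream plus a base system ID.
            return builder.parse(in, systemId);
        } finally {
            in.close();
        }
    }
}
```

DocumentBuilder.parse(File) is another option; it derives the system ID from the File itself, so no manual URI handling is needed.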

Answer 1 (score: 1)

Try this URI style:

file:///d:/folder/folder%20with%20space/file.xml
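Rather than building such a URI by hand, it can be derived from a java.io.File. The sketch below shows that File.toURI() percent-encodes the spaces; the class name FileUriSketch and the sample path are illustrative assumptions, and the exact prefix of the result (for example the number of slashes after file:) depends on the platform.

```java
import java.io.File;
import java.net.URI;

// Hypothetical helper showing how a path with spaces becomes a file: URI.
final class FileUriSketch {

    static String toFileUri(File file) {
        // toURI() resolves the path to an absolute one and escapes
        // characters that are illegal in URIs, including spaces.
        URI uri = file.toURI();
        return uri.toString();
    }

    public static void main(String[] args) {
        // Sample path modeled on the URI shown in this answer.
        File file = new File("D:/folder/folder with space/file.xml");
        System.out.println(toFileUri(file));
    }
}
```

The resulting string can be passed to DocumentBuilder.parse(String) or used as a system ID.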

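Answer 2 (score: 1)

It looks like the parser is trying to connect to the URL in the DOCTYPE header so that it can download the DTD and validate the document against it.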
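If the DTD is not actually needed, one possible workaround is to tell the parser not to fetch it at all. The sketch below assumes the Xerces-based parser bundled with the JRE, which recognizes the Xerces feature URI used here; other parser implementations may reject it. The names NoExternalDtdSketch and parseIgnoringExternalDtd are hypothetical. Note the trade-off: with the DTD skipped there is no DTD validation, and entities declared in the external DTD are not available.

```java
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper; assumes a Xerces-based JAXP implementation.
final class NoExternalDtdSketch {

    // Xerces feature that controls loading of the external DTD
    // when the parser is not validating.
    static final String LOAD_EXTERNAL_DTD =
            "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    static Document parseIgnoringExternalDtd(InputStream in)
            throws IOException, SAXException, ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false);
        // Throws ParserConfigurationException if the implementation
        // does not support this feature.
        factory.setFeature(LOAD_EXTERNAL_DTD, false);
        return factory.newDocumentBuilder().parse(in);
    }
}
```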

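Answer 3 (score: 0)

Try this:

InputSource is = new InputSource();
// test is the FileInputStream from the question, so it is passed as a
// byte stream (StringReader only accepts a String)
is.setByteStream(test);
doc = builder.parse(is);

instead of just parsing 'test'.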