I wrote an application that runs as a Linux daemon and parses XML files of roughly 100 MB each.
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import org.jdom2.input.SAXBuilder;
import org.jdom2.output.Format;
import org.jdom2.output.XMLOutputter;
import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.JDOMException;

public class MyReader {
    private Document doc;
    private Element rootNode;

    public MyReader(String filename) {
        try {
            // Parse the whole file into an in-memory JDOM tree.
            doc = new SAXBuilder().build(new File(filename));
            rootNode = doc.getRootElement();
        } catch (NullPointerException | NumberFormatException | IOException e) {
            e.printStackTrace();
        } catch (JDOMException e) {
            // JDOMParseException (malformed or incomplete XML) ends up here.
            e.printStackTrace();
        }
    }
}
I use this approach to process about 14 files a week, and occasionally one of them fails with this stack trace:
org.jdom2.input.JDOMParseException: Error on line 15410 of document file:/home/files/new/100.xml: XML document structures must start and end within the same entity.
at org.jdom2.input.sax.SAXBuilderEngine.build(SAXBuilderEngine.java:228)
at org.jdom2.input.sax.SAXBuilderEngine.build(SAXBuilderEngine.java:277)
at org.jdom2.input.sax.SAXBuilderEngine.build(SAXBuilderEngine.java:264)
at org.jdom2.input.SAXBuilder.build(SAXBuilder.java:1116)
at com.scoobydoo.files.MyReader.<init>(MyReader.java:38)
at application.Daemon.checkNewImportFiles(Daemon.java:225)
at application.Daemon.startApplication(Daemon.java:68)
at application.Daemon.run(Daemon.java:36)
Caused by: org.xml.sax.SAXParseException; systemId: file:/home/files/new/100.xml; lineNumber: 15410; columnNumber: 35; XML document structures must start and end within the same entity.
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:203)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:400)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:327)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1437)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.endEntity(XMLDocumentFragmentScannerImpl.java:904)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.endEntity(XMLDocumentScannerImpl.java:563)
at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.endEntity(XMLEntityManager.java:1399)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1811)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.skipChar(XMLEntityScanner.java:1460)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2824)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:118)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:643)
at org.jdom2.input.sax.SAXBuilderEngine.build(SAXBuilderEngine.java:217)
... 8 more
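The strange thing is that if I try to import the same file a second time, it imports without any error.
Of course I have checked the file and validated it with xmllint, which reported no problems.
My guess is that SAXBuilder().build() opens an InputStream on the file and that stream gets truncated for some reason. Do you know how I could check this, or what else might cause this problem?
Thanks in advance, everyone!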
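One cheap way to test the truncation theory is to look at the tail of the file just before parsing and see whether the root element has already been closed. The following is only a minimal sketch under assumptions: the class TruncationCheck, the method endsWithClosingTag and the rootName parameter are hypothetical names, the file is assumed to use an ASCII-compatible encoding such as UTF-8, and nothing except whitespace is assumed to follow the closing root tag.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

public class TruncationCheck {

    /**
     * Hypothetical helper: returns true if the last non-whitespace text in
     * the file is the closing tag of the given root element.
     * This is a heuristic, not a well-formedness check.
     */
    public static boolean endsWithClosingTag(Path file, String rootName) throws IOException {
        String closingTag = "</" + rootName + ">";
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long length = raf.length();
            // Read only a small tail so the check stays cheap for a 100 MB file.
            int tailSize = (int) Math.min(length, 256L);
            byte[] tail = new byte[tailSize];
            raf.seek(length - tailSize);
            raf.readFully(tail);
            // Assumption: ASCII-compatible encoding (for example UTF-8).
            String text = new String(tail, StandardCharsets.UTF_8).trim();
            return text.endsWith(closingTag);
        }
    }
}
```

If this returns false at the moment the daemon picks the file up but true a little later, the file was most likely still being written when the parser opened it.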
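Hi everyone, I had a flash of insight this morning: the files being processed are uploaded by users. My guess is that the process sometimes starts reading a file before it has been fully uploaded, so the parse fails because the file is incomplete, not because it is malformed.
This matches the symptoms, and it also explains why reading the file later works: by the time I check it by hand, the upload has finished.
I have made a change to verify that the file is fully uploaded before trying to process it (no idea why I was not doing this before), and I will let you know if the problem shows up again.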
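For anyone who needs something similar, one common heuristic for "fully uploaded" is to require that the file's size and modification time stay unchanged for a short quiet period before parsing. This is not necessarily the change I made, only a minimal sketch under assumptions: UploadCheck, looksFullyUploaded and quietMillis are hypothetical names, and a slow or stalled upload can still fool it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class UploadCheck {

    /**
     * Hypothetical helper: returns true if the file is non-empty and neither
     * its size nor its last-modified time changed during the quiet period.
     * Heuristic only; it cannot prove that the uploader has finished.
     */
    public static boolean looksFullyUploaded(Path file, long quietMillis)
            throws IOException, InterruptedException {
        long sizeBefore = Files.size(file);
        long modifiedBefore = Files.getLastModifiedTime(file).toMillis();

        // Give the uploader time to append more data, if it is still running.
        Thread.sleep(quietMillis);

        return sizeBefore > 0
                && sizeBefore == Files.size(file)
                && modifiedBefore == Files.getLastModifiedTime(file).toMillis();
    }
}
```

The daemon would call this before constructing MyReader and skip the file until a later pass when it returns false. A more robust design, if the upload side can be changed, is to upload under a temporary name and rename the file once the transfer completes, so the daemon never sees a partial file.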
Thanks!