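EDIT
Hi everyone, I think I found the error in one of the files. It contains a line like this: ПÑÑ,ницd°!
It looks like it was encoded in ANSI rather than UTF-8.
I will check the other files to see whether I find something similar.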
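One way to check the other files for the same encoding problem could be a strict UTF-8 decode that fails on the first malformed byte sequence. This is only a minimal sketch: the class name `Utf8Check` and the method `isValidUtf8` are hypothetical and not part of the project above, and it assumes the file fits in memory (which should hold for a file of around 73 MB).

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of the original project.
public class Utf8Check {

    /** Returns true if the bytes decode as strict UTF-8, false otherwise. */
    public static boolean isValidUtf8(byte[] bytes) {
        // REPORT makes the decoder throw instead of silently
        // substituting the replacement character.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java Utf8Check <file> [<file> ...]");
            return;
        }
        for (String name : args) {
            // Assumption: each file is small enough to read in one go.
            byte[] data = Files.readAllBytes(Paths.get(name));
            System.out.println(name + " valid UTF-8: " + isValidUtf8(data));
        }
    }
}
```

A file that passes this check can still be wrong in other ways (for example, text that was double-encoded is still valid UTF-8), so a `true` result only rules out raw non-UTF-8 bytes.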
I am getting a strange error from the DocumentBuilderFactory parser. The errors are the following:
[Fatal Error] standard_000000_3.xml:1221888:48: The element type "tduid" must be terminated by the matching end-tag "</tduid>".
org.xml.sax.SAXParseException; systemId: file:/home/000000/new/standard_000000_3.xml; lineNumber: 1221888; columnNumber: 48; The element type "tduid" must be terminated by the matching end-tag "</tduid>".
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:339)
at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:205)
at com.company.batch.BatchReader.<init>(BatchReader.java:46)
at com.company.batch.BatchFile.open(BatchFile.java:76)
at application.Daemon.checkNewImportFiles(Daemon.java:385)
at application.Daemon.startApplication(Daemon.java:68)
at application.Daemon.run(Daemon.java:36)
[Fatal Error] standard_000000_9.xml:1049516:32: XML document structures must start and end within the same entity.
org.xml.sax.SAXParseException; systemId: file:/home/000000/new/standard_XXXXXX_9.xml; lineNumber: 1049516; columnNumber: 32; XML document structures must start and end within the same entity.
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:339)
at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:205)
at com.company.batch.BatchReader.<init>(BatchReader.java:46)
at com.company.batch.BatchFile.open(BatchFile.java:76)
at application.Daemon.checkNewImportFiles(Daemon.java:385)
at application.Daemon.startApplication(Daemon.java:68)
at application.Daemon.run(Daemon.java:36)
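I know these are normally errors in the XML structure, but that is not the case here: I validated the files with xmllint. The strange thing is that if I create a Java application containing only the class that manages the file, and in main I just call it to open the file, change 2-3 attributes and save, it works and no error occurs.
I thought it could be a memory problem, so I ran the process while monitoring the amount of memory used. The maximum Java memory on the system is currently 900 MB, and the program does not need more than 400 MB.
Example of the XML file where the error happens (first error; the error occurs at the opening tag <onPurchase>):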
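For reference, the kind of in-process memory reading described above can be sketched with `java.lang.Runtime`. The class `MemoryProbe` and its method names are hypothetical, and the values are only a snapshot of the heap at the moment of the call, not a peak measurement.

```java
// Hypothetical helper, not part of the original project.
public class MemoryProbe {

    private static final long MIB = 1024L * 1024L;

    /** Heap currently in use by this JVM, in MiB. */
    public static long usedMiB() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MIB;
    }

    /** Maximum heap the JVM will attempt to use (roughly -Xmx), in MiB. */
    public static long maxMiB() {
        return Runtime.getRuntime().maxMemory() / MIB;
    }

    public static void main(String[] args) {
        System.out.println("used heap: " + usedMiB() + " MiB");
        System.out.println("max heap:  " + maxMiB() + " MiB");
    }
}
```

Calling `usedMiB()` right before and right after the `parse` call would show how much heap the DOM tree takes, keeping in mind that garbage collection can run in between.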
<transaction>
  <eventId>123456</eventId>
  <orderNumber>TEST_ORDER</orderNumber>
  <orderValue>0</orderValue>
  <currency>USD</currency>
  <tduid>testtesttesttesttesttesttesttest</tduid>
  <timestamp>2016-03-05 15:23:00 GMT</timestamp>
  <extraReportingInfo>
    <isUniveralStoreNewPurchaser>True</isUniveralStoreNewPurchaser>
    <onEntry>
      <productType>Test product</productType>
      <tuner>TEST TUNER</tuner>
      <userOs>Linux[2.0.10340.184]</userOs>
      <userDevice>Linux.Ubuntu</userDevice>
    </onEntry>
    <onPurchase>
      <productType>Test Product</productType>
      <tuner>TEST TUNER</tuner>
      <contentOwnership>TST</contentOwnership>
      <userDevice>Unknown</userDevice>
    </onPurchase>
  </extraReportingInfo>
  <reportInfo>
    <item>
      <productNumber>PRODUCT_NUMBER</productNumber>
      <productName>PRODUCT_NAME</productName>
      <price>0</price>
      <quantity>1</quantity>
    </item>
  </reportInfo>
</transaction>
Here is the code that manages loading the XML file:
public BatchReader(String Filename) {
    try {
        this.filename = Filename;
        File XMLFile = new File(Filename);
        DocumentBuilderFactory DBFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder DBBuilder = DBFactory.newDocumentBuilder();
        // Note: this call has no effect on DBBuilder, which was already created above
        DBFactory.setValidating(true);
        this.doc = DBBuilder.parse(XMLFile);
        // http://stackoverflow.com/questions/13786607/normalization-in-dom-parsing-with-java-how-does-it-work
        this.doc.getDocumentElement().normalize();
        // Added for debug
        System.out.println(XMLFile.getAbsolutePath());
        // Setting the batch type
        this.Type = "standard";
        this.organizationId = Integer.parseInt(this.getString("organizationId", this.doc.getDocumentElement()));
        this.Sequence = (this.getString("sequenceNumber", this.doc.getDocumentElement()) != null) ? Integer.parseInt(this.getString("sequenceNumber", this.doc.getDocumentElement())) : 0;
        this.checksum = (this.getString("checksum", this.doc.getDocumentElement()) != null) ? this.getString("checksum", this.doc.getDocumentElement()) : null;
        //this.checksum = this.getString("checksum", this.doc.getDocumentElement());
        this.txAmount = this.doc.getElementsByTagName("transaction").getLength();
    } catch ( NullPointerException | NumberFormatException e) {
        e.printStackTrace();
    } catch (ParserConfigurationException | SAXException | IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}
The error occurs on this line:
this.doc = DBBuilder.parse(XMLFile);
The file is 73 MB.
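Since the parser reports an exact position (for example line 1221888, column 48), it may help to print that line of the file exactly as it is on disk when the daemon reads it. The following is a minimal sketch with hypothetical names (`LinePeek`, `lineAt`). It streams the file instead of loading it whole, and it decodes as ISO-8859-1 on purpose so that bytes that are not valid UTF-8 cannot make the read itself fail; non-ASCII characters will therefore look wrong in the output.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the original project.
public class LinePeek {

    /**
     * Returns the line with the given 1-based number,
     * or null if the file has fewer lines.
     */
    public static String lineAt(Path file, long lineNumber) throws IOException {
        // ISO-8859-1 maps every byte to a character, so decoding never throws.
        try (BufferedReader reader =
                     Files.newBufferedReader(file, StandardCharsets.ISO_8859_1)) {
            String line;
            long current = 0;
            while ((line = reader.readLine()) != null) {
                current++;
                if (current == lineNumber) {
                    return line;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: java LinePeek <file> <lineNumber>");
            return;
        }
        String line = lineAt(Paths.get(args[0]), Long.parseLong(args[1]));
        System.out.println(line == null ? "(file has fewer lines)" : line);
    }
}
```

If the line comes back as `null` or looks cut off, the file on disk is shorter than the position the parser reported, which would be worth comparing against the copy that validates with xmllint.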
I honestly don't know where to look for the problem.
Can you help me?