SAXParseException - The element type must be terminated by the matching end-tag (XML file is valid)

Time: 2016-03-07 09:58:02

Tags: java xml parsing

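Edit

Hi everyone, I think I have found the error in one of the files, in a line that reads: ПÑÑ,ницd°!

This text seems to have been encoded in ANSI rather than UTF-8.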

I will check the other files to see whether they contain something similar.
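One way to check the other files for the same problem is to look for lines that are not valid UTF-8. The following is only a minimal sketch, not part of the original application: the class name `Utf8Check` and the method `firstInvalidLine` are hypothetical, and it assumes the files are meant to be UTF-8 and fit in memory (the 73 MB file should, given the heap size mentioned below).

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: reports the first line of a file that is not valid UTF-8.
public class Utf8Check {

    /** Returns the 1-based number of the first line that is not valid UTF-8, or -1 if all lines are valid. */
    public static int firstInvalidLine(byte[] data) {
        // A strict decoder: malformed input raises an exception instead of being replaced.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        int line = 1;
        int start = 0;
        for (int i = 0; i <= data.length; i++) {
            // The byte 0x0A never occurs inside a multi-byte UTF-8 sequence,
            // so splitting on it is safe even before decoding.
            if (i == data.length || data[i] == '\n') {
                try {
                    decoder.decode(ByteBuffer.wrap(data, start, i - start));
                } catch (CharacterCodingException e) {
                    return line;
                }
                line++;
                start = i + 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        for (String name : args) {
            int line = firstInvalidLine(Files.readAllBytes(Paths.get(name)));
            System.out.println(name + ": " + (line < 0 ? "valid UTF-8" : "invalid UTF-8 at line " + line));
        }
    }
}
```

It could be run as `java Utf8Check standard_000000_3.xml`, and the reported line compared with the line number in the parser error.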

I am getting a strange error from the parser created through DocumentBuilderFactory. The error is as follows:

[Fatal Error] standard_000000_3.xml:1221888:48: The element type "tduid" must be terminated by the matching end-tag "</tduid>".
org.xml.sax.SAXParseException; systemId: file:/home/000000/new/standard_000000_3.xml; lineNumber: 1221888; columnNumber: 48; The element type "tduid" must be terminated by the matching end-tag "</tduid>".
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:339)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:205)
    at com.company.batch.BatchReader.<init>(BatchReader.java:46)
    at com.company.batch.BatchFile.open(BatchFile.java:76)
    at application.Daemon.checkNewImportFiles(Daemon.java:385)
    at application.Daemon.startApplication(Daemon.java:68)
    at application.Daemon.run(Daemon.java:36)
[Fatal Error] standard_000000_9.xml:1049516:32: XML document structures must start and end within the same entity.
org.xml.sax.SAXParseException; systemId: file:/home/000000/new/standard_XXXXXX_9.xml; lineNumber: 1049516; columnNumber: 32; XML document structures must start and end within the same entity.
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:339)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:205)
    at com.company.batch.BatchReader.<init>(BatchReader.java:46)
    at com.company.batch.BatchFile.open(BatchFile.java:76)
    at application.Daemon.checkNewImportFiles(Daemon.java:385)
    at application.Daemon.startApplication(Daemon.java:68)
    at application.Daemon.run(Daemon.java:36)

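I know these usually indicate errors in the XML structure, but that is not the case here: I tried validating the file with xmllint. The strange thing is that if I create a Java application containing only the class that manages the file, and in main insert a call that just opens the file, changes 2-3 attributes and saves it, it works and no error occurs.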
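For reference, the standalone test described above could look roughly like the following. This is only a sketch under assumptions: the class name `StandaloneCheck`, the method `roundTrip` and the attribute name `checked` are hypothetical, and the real test used the application's own class rather than this code.

```java
import java.io.File;
import java.io.OutputStream;
import java.nio.file.Files;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Hypothetical standalone check: open the file, change an attribute, save it.
public class StandaloneCheck {

    public static void roundTrip(File in, File out) throws Exception {
        // Parse with the default JAXP DOM parser, as the application does.
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(in);

        // Change an attribute on the root element ("checked" is a made-up name).
        doc.getDocumentElement().setAttribute("checked", "true");

        // Serialize the modified DOM with an identity transformer.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        try (OutputStream os = Files.newOutputStream(out.toPath())) {
            transformer.transform(new DOMSource(doc), new StreamResult(os));
        }
    }

    public static void main(String[] args) throws Exception {
        roundTrip(new File(args[0]), new File(args[1]));
        System.out.println("Parsed and saved without errors: " + args[1]);
    }
}
```

If this succeeds on the same file that fails inside the daemon, the difference is more likely to lie in how or when the daemon reads the file than in the file's structure.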

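I thought it might be a memory problem, so I ran the process while monitoring how much memory it used. The maximum Java memory on the system is currently 900 MB, and the program does not need more than 400 MB.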

Example from the XML file where the error occurs (first error; it is reported at the start tag <onPurchase>):

<transaction>
  <eventId>123456</eventId>
  <orderNumber>TEST_ORDER</orderNumber>
  <orderValue>0</orderValue>
  <currency>USD</currency>
  <tduid>testtesttesttesttesttesttesttest</tduid>
  <timestamp>2016-03-05 15:23:00 GMT</timestamp>
  <extraReportingInfo>
    <isUniveralStoreNewPurchaser>True</isUniveralStoreNewPurchaser>
    <onEntry>
      <productType>Test product</productType>
      <tuner>TEST TUNER</tuner>
      <userOs>Linux[2.0.10340.184]</userOs>
      <userDevice>Linux.Ubuntu</userDevice>
    </onEntry>
    <onPurchase>
      <productType>Test Product</productType>
      <tuner>TEST TUNER</tuner>
      <contentOwnership>TST</contentOwnership>
      <userDevice>Unknown</userDevice>
    </onPurchase>
  </extraReportingInfo>
  <reportInfo>
    <item>
      <productNumber>PRODUCT_NUMBER</productNumber>
      <productName>PRODUCT_NAME</productName>
      <price>0</price>
      <quantity>1</quantity>
    </item>
  </reportInfo>
</transaction>

Here is the code that loads the XML file:

public BatchReader(String Filename) {
    try {
        this.filename = Filename;
        File XMLFile = new File(Filename);
        DocumentBuilderFactory DBFactory = DocumentBuilderFactory.newInstance();

        DocumentBuilder DBBuilder = DBFactory.newDocumentBuilder();
        // Note: this is called after DBBuilder was created, so it does not affect DBBuilder
        DBFactory.setValidating(true);
        this.doc = DBBuilder.parse(XMLFile);

        // http://stackoverflow.com/questions/13786607/normalization-in-dom-parsing-with-java-how-does-it-work
        this.doc.getDocumentElement().normalize();

        // Added for debug
        System.out.println(XMLFile.getAbsolutePath());

        // Setting the batch type
        this.Type = "standard";

        this.organizationId = Integer.parseInt(this.getString("organizationId", this.doc.getDocumentElement()));

        // sequenceNumber is optional and defaults to 0
        this.Sequence = (this.getString("sequenceNumber", this.doc.getDocumentElement()) != null) ? Integer.parseInt(this.getString("sequenceNumber", this.doc.getDocumentElement())) : 0;
        this.checksum = (this.getString("checksum", this.doc.getDocumentElement()) != null) ? this.getString("checksum", this.doc.getDocumentElement()) : null;

        //this.checksum = this.getString("checksum", this.doc.getDocumentElement());

        // Number of <transaction> elements in the batch
        this.txAmount = this.doc.getElementsByTagName("transaction").getLength();

    } catch (NullPointerException | NumberFormatException e) {
        e.printStackTrace();
    } catch (ParserConfigurationException | SAXException | IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}
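One detail in the constructor above: `setValidating(true)` is called after `newDocumentBuilder()`, and factory settings only apply to builders created afterwards, so that call has no effect on the builder that does the parsing. The sketch below shows one possible arrangement, with the factory configured first and an `ErrorHandler` attached so that every parse problem is thrown with its position. The class name `ParseSketch` and its methods are hypothetical, and leaving DTD validation off is an assumption based on the sample having no DOCTYPE.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical sketch: configure the factory before creating the builder,
// and fail loudly on every parser warning or error.
public class ParseSketch {

    public static DocumentBuilder newBuilder() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Settings must be applied before newDocumentBuilder() to take effect.
        // Assumption: the files have no DTD, so DTD validation stays off.
        factory.setValidating(false);

        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setErrorHandler(new ErrorHandler() {
            @Override
            public void warning(SAXParseException e) throws SAXException {
                throw e;
            }

            @Override
            public void error(SAXParseException e) throws SAXException {
                throw e;
            }

            @Override
            public void fatalError(SAXParseException e) throws SAXException {
                throw e;
            }
        });
        return builder;
    }

    public static Document parse(InputStream in)
            throws ParserConfigurationException, SAXException, IOException {
        return newBuilder().parse(in);
    }

    public static void main(String[] args) throws Exception {
        try {
            Document doc = newBuilder().parse(new File(args[0]));
            System.out.println("Root element: " + doc.getDocumentElement().getTagName());
        } catch (SAXParseException e) {
            // Line and column point at where the parser gave up, not necessarily at the cause.
            System.err.println(e.getSystemId() + ":" + e.getLineNumber() + ":"
                    + e.getColumnNumber() + ": " + e.getMessage());
        }
    }
}
```

This does not by itself explain the failure; it only makes sure that the parser configuration is the one intended and that the exact position of each error is available to the caller.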

The error occurs on this line:

this.doc = DBBuilder.parse(XMLFile);

The file is 73 MB. I honestly don't know where to look for the problem.
Can you help me?

0 answers:

No answers