OutOfMemoryError in StAX

Time: 2011-06-28 06:22:08

Tags: java xml stax

I am using the following simple StAX code to iterate over all the tags in an XML file. The size of input.xml is more than 100 MB.

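import java.io.FileInputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;

class ListTags {
    public static void main(String[] args) throws Exception {
        XMLInputFactory xif = XMLInputFactory.newInstance();
        try (FileInputStream in = new FileInputStream("input.xml")) {
            XMLStreamReader xsr = xif.createXMLStreamReader(in);
            // The reader starts at START_DOCUMENT; advance exactly one event per
            // iteration so that the root element's start tag is not skipped.
            while (xsr.hasNext()) {
                xsr.next();
                // getLocalName() is only valid on start and end element events.
                if (xsr.isStartElement() || xsr.isEndElement()) {
                    System.out.println(xsr.getLocalName());
                }
            }
            xsr.close();
        }
    }
}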

I get this error:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

Please tell me how to solve this. I have read that StAX handles large XML files well, but I am getting the same error as with DOM parsers.

3 answers:

Answer 0 (score: 1)

Use the -Xmx parameter to increase the maximum heap size of the VM.

java -Xmx512m ....
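
To confirm that the flag actually reached the JVM, the effective limit can be printed from inside the program. This is only a minimal sketch: the class name HeapInfo and the helper maxHeapMiB are made up for illustration, and Runtime.maxMemory() reports the maximum the JVM will attempt to use, which may differ slightly from the exact -Xmx value.

```java
class HeapInfo {
    // Maximum amount of memory the JVM will attempt to use, in MiB.
    // (Helper name is hypothetical; Runtime.maxMemory() is the standard call.)
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Running it as `java -Xmx512m HeapInfo` should report a value close to 512; if the number does not change with the flag, the option is probably not being passed to the process that runs the parser.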

Answer 1 (score: 1)

Define the heap size when starting the JVM:

-Xms    initial java heap size
-Xmx    maximum java heap size
-Xmn    the size of the heap for the young generation

Example:

bin/java.exe -Xmn100M -Xms500M -Xmx500M

Answer 2 (score: 0)

From Wikipedia: Traditionally, XML APIs are either:

tree based - the entire document is read into memory as a tree structure for random 
access by the calling application
event based - the application registers to receive events as entities are encountered 
within the source document.

StAX was designed as a median between these two opposites. In the StAX metaphor,
the programmatic entry point is a cursor that represents a point within the 
document. The application moves the cursor forward - 'pulling' the information from 
the parser as it needs. This is different from an event based API - such as SAX - 
which 'pushes' data to the application - requiring the application to maintain state 
between events as necessary to keep track of location within the document.
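
The pull versus push distinction in that quote can be sketched with the two standard JDK APIs. This is an illustration under assumptions, not a benchmark: the class name PullVsPush and both method names are invented here, and the input is a tiny in-memory string rather than a large file. Both methods count start tags; neither builds a tree.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class PullVsPush {
    // Pull (StAX): the application moves the cursor and asks for each event.
    static int countWithStax(String xml) throws Exception {
        XMLStreamReader xsr = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        int count = 0;
        while (xsr.hasNext()) {
            if (xsr.next() == XMLStreamConstants.START_ELEMENT) {
                count++;
            }
        }
        xsr.close();
        return count;
    }

    // Push (SAX): the parser drives and calls back into the handler,
    // so any state (here, the counter) has to live outside the callback.
    static int countWithSax(String xml) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                count[0]++;
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        String xml = "<a><b/><b/></a>";
        System.out.println("StAX: " + countWithStax(xml)); // prints "StAX: 3"
        System.out.println("SAX:  " + countWithSax(xml));  // prints "SAX:  3"
    }
}
```

In both cases the document is consumed as a stream, which is why neither API should need memory proportional to the file size for a loop like the one in the question.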

So for files of 100 MB or more, I would prefer SAX over StAX where that is possible.

But I tried the code on a 64-bit JVM with a 2.6 GB file and had no problems. So I think the problem is not the size of the file but possibly the data in it.
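
If the data is the suspect, one way to look at it is to stream through the document and record a few shape statistics. The sketch below is a guess at what might be worth measuring, not a diagnosis: the class name XmlShape and the method scan are hypothetical, and nothing in this thread establishes that element count, nesting depth, or text length is what triggers the error. If the scan itself runs out of memory, that at least narrows the problem to the parser and the input rather than the application code.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

class XmlShape {
    // Returns {element count, maximum nesting depth, longest text chunk in chars}.
    // Text may be delivered in several chunks, so the last value is per chunk.
    static long[] scan(Reader source) throws Exception {
        XMLStreamReader xsr = XMLInputFactory.newInstance().createXMLStreamReader(source);
        long elements = 0, depth = 0, maxDepth = 0, longestText = 0;
        while (xsr.hasNext()) {
            switch (xsr.next()) {
                case XMLStreamConstants.START_ELEMENT:
                    elements++;
                    depth++;
                    maxDepth = Math.max(maxDepth, depth);
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    depth--;
                    break;
                case XMLStreamConstants.CHARACTERS:
                case XMLStreamConstants.CDATA:
                    longestText = Math.max(longestText, xsr.getTextLength());
                    break;
                default:
                    break;
            }
        }
        xsr.close();
        return new long[] {elements, maxDepth, longestText};
    }

    public static void main(String[] args) throws Exception {
        // Tiny in-memory sample; for a real file, pass a FileReader or similar.
        long[] s = scan(new StringReader("<a><b>hello</b><c><d/></c></a>"));
        System.out.println("elements=" + s[0] + " maxDepth=" + s[1]
                + " longestTextChunk=" + s[2]);
    }
}
```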